NOVEL TGF-β PATHWAY GENES

Background of the Invention 

The transforming growth factor-β (TGF-β) family of proteins consists of a
number of related, but functionally distinct, proteins (Barnard, J.A. et al. (1990)
Biochim. Biophys. Acta 1032:79-87; Roberts, A.B. and Sporn, M.B. eds. The
Transforming Growth Factor-βs in Peptide Growth Factors and Their Receptors. I.
Handbook of Experimental Pharmacology, vol. 95/1 (Springer-Verlag, Berlin, 1990)
419-472). One member of the TGF-β family of proteins, TGF-β1, is a multifunctional
cytokine with both growth promoting and inhibiting activities. Recently, TGF-β1 has
been found to play a role in modulating repair of vascular injuries such as restenosis
lesions (Nikol, S. et al. (1992) J. Clin. Invest. 90:1582-1592) and atherosclerotic plaques
(Kojima, S. et al. (1991) J. Cell Biol. 113(6):1439-1445).

Members of the TGF-β family of proteins initiate cell signaling by binding to
heteromeric receptor complexes of type I (TβRI) and type II (TβRII) serine/threonine
kinase receptors (reviewed by Massague, J. et al. (1994) Trends Cell Biol. 4:172-178;
Miyazono, K. et al. (1994) Adv. Immunol. 55:181-220). Activation of this heteromeric
receptor complex occurs when TGF-β binds to TβRII, which then recruits and
phosphorylates TβRI. Activated TβRI then propagates the signal to downstream targets
(Chen, F. and Weinberg, R.A. (1995) PNAS 92:1565-1569; Wrana, J.L. et al. (1994)
Nature 370:341-347).

Until recently, the proteins involved in the intracellular TGF-β signaling
pathway were largely unknown. In 1995, however, a protein from Drosophila
melanogaster, named Mothers against dpp ("MAD"), was cloned and found to be
required for cell signaling by the TGF-β family member decapentaplegic (dpp)
(Sekelsky, J.J. et al. (1995) Genetics 139:1347-1358). Subsequently, cDNAs for four
human homologues of the MAD protein, named hMAD1-4 and now generally known as
MAD-related (MADR) proteins, were isolated, and at least two of these (hMAD-3 and
hMAD-4) were characterized as effectors of TGF-β cellular responses (Zhang, Y. et al.
(1996) Nature 383:168-172). hMAD-1 corresponds to MADR1, a tumor suppressor
whose inactivation may play a role in colorectal cancer (Eppert, K. et al. (1996) Cell
86:543-552). hMAD-4 is identical to DPC4, a candidate tumor suppressor whose
inactivation may play a role in pancreatic and other human cancers (Hahn, S.A. et al.
(1996) Science 271:350-353). Once a cell is activated by a member of the TGF-β family
of proteins, activated MADR proteins or complexes of MADR proteins may be
translocated into the nucleus to function as transcriptional activator(s). Thus, as
members of the TGF-β family initiate a variety of beneficial effects on various cell
types, e.g., epithelial cells and endothelial cells, it is desirable to modulate TGF-β effects
on such cells. One method of modulating TGF-β initiated cell function is to modulate
the function of proteins, such as the MADR proteins, which are involved in propagating
the TGF-β signal in the cell.

Summary of the Invention 

This invention provides a novel nucleic acid molecule which encodes a protein,
referred to herein as Endothelial MAD Interactor 1 ("EMI1") protein, which is capable
of, for example, modulating the activity of proteins involved in the TGF-β signaling
pathway to thereby modulate the effects of TGF-β on TGF-β responsive cells. Nucleic
acid molecules encoding an EMI1 protein are referred to herein as EMI1 nucleic acid
molecules. In a preferred embodiment, the EMI1 protein interacts with (e.g., binds to) a
protein which is a member of the MADR family of proteins. Examples of such proteins
include Drosophila MAD, human MADR6 (also known as the fchd534 gene product)
and human MADR7 (also known as the fchd540 gene product). MADR6 and MADR7
are described in United States Serial Numbers 08/599,654 and 08/799,910, respectively,
the contents of which are expressly incorporated herein by reference.

MADR6 and MADR7 proteins are expressed in endothelial cells, are known to
interact with one another, and are up-regulated in endothelial cells in a model of shear
stress conditions. It has also been found that MADR6 and MADR7 inhibit TGF-β
signaling in endothelial cells. As TGF-β signaling of endothelial cells is involved in
repair of vascular injuries and MADR6 and MADR7 have been found to inhibit this
TGF-β initiated activity in endothelial cells, MADR6 and MADR7 are good targets for
modulating TGF-β initiated repair of vascular injuries. The EMI1 protein of the present
invention binds to MADR6 and MADR7 and modulates their activity. Thus, EMI1
molecules can be used to modulate TGF-β initiated repair of vascular injuries and thus
to treat cardiovascular disorders.

In addition, MADR proteins function in other cell types, e.g., epithelial cells,
gut-derived epithelial cells such as epithelial cells of the pancreas and colon, to mediate
TGF-β signaling. For example, MADR1 mediates TGF-β tumor suppressor effects in
gut-derived epithelial cells. Thus, proteins, such as EMI1, which modulate the activity
of MADR1, are also useful in the treatment of cancers, e.g., epithelial cell cancers such
as colorectal carcinomas.

Accordingly, one aspect of the invention pertains to isolated nucleic acid
molecules (e.g., cDNAs) comprising a nucleotide sequence encoding an EMI1 protein or
biologically active portions thereof, as well as nucleic acid fragments suitable as primers
or hybridization probes for the detection of EMI1-encoding nucleic acid (e.g., mRNA).
In particularly preferred embodiments, the isolated nucleic acid molecule comprises the
nucleotide sequence of SEQ ID NO:1, the nucleotide sequence of the DNA insert of the
plasmid deposited with ATCC as Accession Number 98375, or the coding region or a
complement of either of these nucleotide sequences. In other particularly preferred
embodiments, the isolated nucleic acid molecule of the invention comprises a nucleotide
sequence which hybridizes to or is at least about 60-65%, preferably at least about
70-75%, more preferably at least about 80-85%, and even more preferably at least about
90-95% or more homologous to the nucleotide sequence shown in SEQ ID NO:1, the
nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as
Accession Number 98375, or a portion of either of these nucleotide sequences. In other
preferred embodiments, the isolated nucleic acid molecule encodes the amino acid
sequence of SEQ ID NO:2 or an amino acid sequence encoded by the nucleotide
sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number
98375. The preferred EMI1 proteins of the present invention also preferably possess at
least one of the EMI1 activities described herein.
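
By way of illustration only, the percentage thresholds recited above can be pictured
as a percent identity computed over an alignment. The following minimal sketch
assumes that "homologous" is read as percent identity between two sequences that have
already been aligned to equal length (an assumption made for this example; the
passage above does not fix a particular algorithm). The function name and the gap
convention are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for illustration: '-' marks a gap, columns where both
    sequences carry a gap are skipped, and a gap opposite a residue
    counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float) -> bool:
    """True if the two aligned sequences are at least `minimum` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= minimum
```

For example, under these assumptions a candidate sequence could be checked against
the lowest recited threshold with meets_threshold(candidate, reference, 60.0).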

In another embodiment, the isolated nucleic acid molecule encodes a protein or
portion thereof wherein the protein or portion thereof includes an amino acid sequence
which is sufficiently homologous to an amino acid sequence of SEQ ID NO:2, e.g.,
sufficiently homologous to an amino acid sequence of SEQ ID NO:2 such that the
protein or portion thereof maintains an EMI1 activity. Preferably, the protein or portion
thereof encoded by the nucleic acid molecule maintains the ability to modulate a TGF-β
response in a TGF-β responsive cell. In one embodiment, the protein encoded by the
nucleic acid molecule is at least about 60-70%, preferably at least about 80-85%, and
more preferably at least about 86, 88, 90%, and most preferably at least about 90-95% or
more homologous to the amino acid sequence of SEQ ID NO:2 (e.g., the entire amino
acid sequence of SEQ ID NO:2) or the amino acid sequence encoded by the nucleotide
sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number
98375. In another preferred embodiment, the protein is a full length human protein
which is substantially homologous to the entire amino acid sequence of SEQ ID NO:2
(encoded by the open reading frame shown in SEQ ID NO:3).

In yet another embodiment, the isolated nucleic acid molecule is derived from a
human and encodes a portion of a protein which includes a WW domain. Preferably, the
WW domain encoded by the human nucleic acid molecule is at least about 55%,
preferably at least about 60-65%, even more preferably at least about 70-75%, and most
preferably at least about 80-90% or more homologous to the WW domain (i.e., amino
acid residues 300-335) of SEQ ID NO:2 which is shown as a separate sequence
designated SEQ ID NO:4. In still another embodiment, the nucleic acid molecule is a
nonmammalian molecule which encodes a WW domain. Preferably, the WW domain
encoded by the nonmammalian nucleic acid is at least about 75%, more preferably at
least about 80-85%, and most preferably at least about 90-95% or more homologous to
SEQ ID NO:4.
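
As a small illustration of the residue numbering used above, the sketch below extracts
a 1-based, inclusive residue range from a protein sequence string. The helper name is
hypothetical and no actual sequence data is reproduced here; the only fact relied on
is that residues 300 through 335, counted inclusively, span 36 positions.

```python
def residues(sequence: str, first: int, last: int) -> str:
    """Return residues first..last of `sequence`, numbered from 1, inclusive.

    Patent-style residue numbering is 1-based and inclusive, whereas Python
    slices are 0-based and exclude the end index, hence the `first - 1`.
    """
    if not (1 <= first <= last <= len(sequence)):
        raise ValueError("residue range lies outside the sequence")
    return sequence[first - 1:last]


def range_length(first: int, last: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return last - first + 1
```

Applied to the full-length sequence, residues(seq, 300, 335) would return the 36
residue segment that the text designates as the WW domain.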

In another preferred embodiment, the isolated nucleic acid molecule is derived
from a human and encodes a protein (e.g., an EMI1 fusion protein) which includes a
WW domain which is at least about 55% or more homologous to SEQ ID NO:4 and has
one or more of the following activities: 1) it can interact with (e.g., bind to) an MADR
protein; 2) it can modulate the activity of an MADR protein; 3) it can interact with (e.g.,
bind to) a protein having a PY motif; 4) it can modulate the activity of a protein having a
PY motif; and 5) it can modulate a TGF-β response in a TGF-β responsive cell (e.g., an
epithelial cell or an endothelial cell) to, for example, beneficially affect the TGF-β
responsive cell.

In another embodiment, the isolated nucleic acid molecule is at least 15
nucleotides in length and hybridizes under stringent conditions to a nucleic acid
molecule comprising the nucleotide sequence of SEQ ID NO:1 or to the nucleotide
sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number
98375. Preferably, the isolated nucleic acid molecule corresponds to a naturally-
occurring nucleic acid molecule. More preferably, the isolated nucleic acid encodes
naturally-occurring human EMI1 or a biologically active portion thereof. Moreover,
given the disclosure herein of an EMI1-encoding cDNA sequence (e.g., SEQ ID NO:1),
antisense nucleic acid molecules (i.e., molecules which are complementary to the coding
strand of the EMI1 cDNA sequence) are also provided by the invention.
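
The relationship between a coding strand and a complementary (antisense) strand can
be sketched as follows. This is a minimal illustration assuming a DNA sequence written
5' to 3' using only the letters A, C, G and T; the function name is hypothetical and
the example sequence is a short placeholder, not part of SEQ ID NO:1.

```python
# Watson-Crick pairing for DNA, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(coding_strand: str) -> str:
    """Return the strand complementary to `coding_strand`, written 5' to 3'.

    Each base is replaced by its pairing partner and the result is reversed,
    because the two strands of a duplex run in opposite directions.
    """
    return coding_strand.translate(_COMPLEMENT)[::-1]


# Placeholder coding-strand fragment used only for demonstration.
example = "ATGGCC"
antisense = reverse_complement(example)
```

Taking the reverse complement twice returns the original strand, which is a
convenient sanity check on any such helper.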

Another aspect of the invention pertains to vectors, e.g., recombinant expression
vectors, containing the nucleic acid molecules of the invention and host cells into which
such vectors have been introduced. In one embodiment, such a host cell is used to
produce EMI1 protein by culturing the host cell in a suitable medium. If desired, the
EMI1 protein can be then isolated from the medium or the host cell.

Yet another aspect of the invention pertains to transgenic nonhuman animals in
which an EMI1 gene has been introduced or altered. In one embodiment, the genome of
the nonhuman animal has been altered by introduction of a nucleic acid molecule of the
invention encoding EMI1 as a transgene. In another embodiment, an endogenous EMI1
gene within the genome of the nonhuman animal has been altered, e.g., functionally
disrupted, by homologous recombination.

Still another aspect of the invention pertains to an isolated EMI1 protein or a
portion, e.g., a biologically active portion, thereof. In a preferred embodiment, the
isolated EMI1 protein or portion thereof can modulate a TGF-β response in a TGF-β
responsive cell. In another preferred embodiment, the isolated EMI1 protein or portion
thereof is sufficiently homologous to an amino acid sequence of SEQ ID NO:2 such that
the protein or portion thereof maintains the ability to modulate a TGF-β response in a
TGF-β responsive cell.

In one embodiment, the biologically active portion of the EMI1 protein includes
a domain or motif, preferably a domain or motif which has an EMI1 activity. The
domain can be a WW domain. If the active portion of the protein which comprises the
WW domain is isolated or derived from a human, it is preferred that the WW domain be
at least about 55%, preferably at least about 60-65%, even more preferably at least about
70-75%, and most preferably at least about 80-90% or more homologous to SEQ ID
NO:4. If the active portion of the protein which comprises the WW domain is isolated
or derived from an animal which is not a mammal, it is preferred that the WW domain
be at least about 75%, preferably at least about 80-85%, and most preferably at least
about 90-95% or more homologous to SEQ ID NO:4. Preferably, the biologically active
portion of the EMI1 protein which includes a WW domain also has one of the following
activities: 1) it can interact with (e.g., bind to) an MADR protein; 2) it can modulate the
activity of an MADR protein; 3) it can interact with (e.g., bind to) a protein having a PY
motif; 4) it can modulate the activity of a protein having a PY motif; and 5) it can
modulate a TGF-β response in a TGF-β responsive cell (e.g., an epithelial cell or an
endothelial cell) to, for example, beneficially affect the TGF-β responsive cell.

The invention also provides an isolated preparation of an EMI1 protein. In
preferred embodiments, the EMI1 protein comprises the amino acid sequence of SEQ ID
NO:2 or an amino acid sequence encoded by the nucleotide sequence of the DNA insert
of the plasmid deposited with ATCC as Accession Number 98375. In another preferred
embodiment, the invention pertains to an isolated full length protein which is
substantially homologous to the entire amino acid sequence of SEQ ID NO:2 (encoded
by the open reading frame shown in SEQ ID NO:3). In yet another embodiment, the
protein is at least about 60-70%, preferably at least about 80-85%, and more preferably
at least about 86, 88, 90%, and most preferably at least about 90-95% or more
homologous to the entire amino acid sequence of SEQ ID NO:2. In other embodiments,
the isolated EMI1 protein comprises an amino acid sequence which is at least about
60-70% or more homologous to the amino acid sequence of SEQ ID NO:2 and has one
or more of the following activities: 1) it can interact with (e.g., bind to) an MADR
protein; 2) it can modulate the activity of an MADR protein; 3) it can interact with (e.g.,
bind to) a protein having a PY motif; 4) it can modulate the activity of a protein having a
PY motif; and 5) it can modulate a TGF-β response in a TGF-β responsive cell (e.g., an
epithelial cell or an endothelial cell) to, for example, beneficially affect the TGF-β
responsive cell. Alternatively, the isolated EMI1 protein can comprise an amino acid
sequence which is encoded by a nucleotide sequence which hybridizes, e.g., hybridizes
under stringent conditions, or is at least about 60-65%, preferably at least about 70-75%,
more preferably at least about 80-85%, and even more preferably at least about 90-95%
or more homologous to the nucleotide sequence of SEQ ID NO:1 or the nucleotide
sequence of the DNA insert of the plasmid deposited with ATCC as Accession Number
98375. It is also preferred that the preferred forms of EMI1 also have one or more of the
EMI1 activities described herein.

The EMI1 protein (or polypeptide) or a biologically active portion thereof can be
operatively linked to a non-EMI1 polypeptide to form a fusion protein. In addition, the
EMI1 protein or a biologically active portion thereof can be incorporated into a
pharmaceutical composition comprising the protein and a pharmaceutically acceptable
carrier.

The EMI1 protein of the invention, or portions or fragments thereof, can be used
to prepare anti-EMI1 antibodies. Accordingly, the invention also provides an antigenic
peptide of EMI1 which comprises at least 8 amino acid residues of the amino acid
sequence shown in SEQ ID NO:2 and encompasses an epitope of EMI1 such that an
antibody raised against the peptide forms a specific immune complex with EMI1.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, more
preferably at least 15 amino acid residues, even more preferably at least 20 amino acid
residues, and most preferably at least 30 amino acid residues. The invention further
provides an antibody that specifically binds EMI1. In one embodiment, the antibody is
monoclonal. In another embodiment, the antibody is coupled to a detectable substance.
In yet another embodiment, the antibody is incorporated into a pharmaceutical
composition comprising the antibody and a pharmaceutically acceptable carrier.
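
The minimum peptide lengths recited above can be illustrated by enumerating every
contiguous fragment of a given length from a protein sequence. The sketch below is
illustrative only: it lists candidate fragments by length and makes no attempt to
predict which fragments encompass an epitope. The function name and the short
placeholder sequence are hypothetical and are not taken from SEQ ID NO:2.

```python
MIN_PEPTIDE_LENGTH = 8  # lowest length recited in the text


def candidate_peptides(protein: str, length: int = MIN_PEPTIDE_LENGTH) -> list[str]:
    """List every contiguous fragment of `protein` that is `length` residues long.

    Returns an empty list when the protein is shorter than `length`.
    """
    if length < MIN_PEPTIDE_LENGTH:
        raise ValueError("peptides shorter than 8 residues are not considered")
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# Hypothetical 16-residue placeholder sequence for demonstration.
placeholder = "MKTAYIAKQRQISFVK"
eight_mers = candidate_peptides(placeholder)
ten_mers = candidate_peptides(placeholder, 10)
```

A sequence of n residues yields n - length + 1 such fragments, so the number of
candidates shrinks as the preferred minimum length rises from 8 toward 30.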

Another aspect of the invention pertains to methods for modulating a cell
associated activity, e.g., proliferation or differentiation. Such methods include
contacting the cell with an agent which modulates EMI1 protein activity or EMI1
nucleic acid expression such that a cell associated activity is altered relative to a cell
associated activity (e.g., the same cell associated activity) of the cell in the absence of
the agent. In a preferred embodiment, the cell is capable of responding to TGF-β
through a signaling pathway involving an EMI1 protein (e.g., an epithelial cell or an
endothelial cell). The agent which modulates EMI1 activity can be an agent which
stimulates EMI1 protein activity or EMI1 nucleic acid expression. Examples of agents
which stimulate EMI1 protein activity or EMI1 nucleic acid expression include small
molecules, active EMI1 proteins, and nucleic acids encoding EMI1 that have been
introduced into the cell. Examples of agents which inhibit EMI1 activity or expression
include small molecules, antisense EMI1 nucleic acid molecules, and antibodies that
specifically bind to EMI1. In a preferred embodiment, the cell is present within a
subject and the agent is administered to the subject.

The present invention also pertains to methods for treating subjects having
various disorders. For example, the invention pertains to methods for treating a subject
having a disorder characterized by aberrant EMI1 protein activity or nucleic acid
expression such as a cardiovascular disorder, e.g., atherosclerosis, or a proliferative
disorder, e.g., a proliferative disorder characterized by uncontrolled proliferation of
epithelial cells. These methods include administering to the subject an EMI1 modulator
(e.g., a small molecule) such that treatment of the subject occurs.

In another embodiment, the invention pertains to methods for treating a subject
having a cardiovascular disorder, e.g., atherosclerosis, or a proliferative disorder, e.g., a
proliferative disorder characterized by uncontrolled proliferation of epithelial cells,
comprising administering to the subject an EMI1 modulator such that treatment occurs.

In other embodiments, the invention pertains to methods for treating a subject
having a cardiovascular disorder or a proliferative disorder comprising administering to
the subject an EMI1 protein or portion thereof such that treatment occurs.
Cardiovascular and proliferative disorders can also be treated according to the invention
by administering to the subject having the disorder a nucleic acid encoding an EMI1
protein or portion thereof such that treatment occurs.

The invention also pertains to methods for detecting genetic lesions in an EMI1
gene, thereby determining if a subject with the lesioned gene is at risk for (or is
predisposed to have) a disorder characterized by aberrant or abnormal EMI1 nucleic acid
expression or EMI1 protein activity, e.g., a cardiovascular disorder or a proliferative
disorder. In preferred embodiments, the methods include detecting, in a sample of cells
from the subject, the presence or absence of a genetic lesion characterized by an
alteration affecting the integrity of a gene encoding an EMI1 protein, or the
misexpression of the EMI1 gene.

Another aspect of the invention pertains to methods for detecting the presence of
EMI1 in a biological sample. In a preferred embodiment, the methods involve
contacting a biological sample (e.g., an endothelial cell sample) with a compound or an
agent capable of detecting EMI1 protein or EMI1 mRNA such that the presence of
EMI1 is detected in the biological sample. The compound or agent can be, for example,
a labeled or labelable nucleic acid probe capable of hybridizing to EMI1 mRNA or a
labeled or labelable antibody capable of binding to EMI1 protein. The invention further
provides methods for diagnosis of a subject with, for example, a cardiovascular disease
or a proliferative disorder, based on detection of EMI1 protein or mRNA. In one
embodiment, the method involves contacting a cell or tissue sample (e.g., an epithelial
cell or an endothelial cell sample) from the subject with an agent capable of detecting
EMI1 protein or mRNA, determining the amount of EMI1 protein or mRNA expressed
in the cell or tissue sample, comparing the amount of EMI1 protein or mRNA expressed
in the cell or tissue sample to a control sample and forming a diagnosis based on the
amount of EMI1 protein or mRNA expressed in the cell or tissue sample as compared to
the control sample. Preferably, the cell sample is an endothelial cell sample. Kits for
detecting EMI1 in a biological sample are also within the scope of the invention.

Still another aspect of the invention pertains to methods, e.g., screening assays,
for identifying a compound for treating a disorder characterized by aberrant EMI1
nucleic acid expression or protein activity, e.g., a cardiovascular disorder or a
proliferative disorder. These methods typically include assaying the ability of the
compound or agent to modulate the expression of the EMI1 gene or the activity of the
EMI1 protein thereby identifying a compound for treating a disorder characterized by
aberrant EMI1 nucleic acid expression or protein activity. In a preferred embodiment,
the method involves contacting a biological sample, e.g., a cell or tissue sample, e.g., an
endothelial cell sample, obtained from a subject having the disorder with the compound
or agent, determining the amount of EMI1 protein expressed and/or measuring the
activity of the EMI1 protein in the biological sample, comparing the amount of EMI1
protein expressed in the biological sample and/or the measurable EMI1 biological
activity in the cell to that of a control sample. An alteration in the amount of EMI1
protein expression or EMI1 activity in the cell exposed to the compound or agent in
comparison to the control is indicative of a modulation of EMI1 expression and/or EMI1
activity.

The invention also pertains to methods for identifying a compound or agent
which interacts with (e.g., binds to) an EMI1 protein. These methods can include the
steps of contacting the EMI1 protein with the compound or agent under conditions
which allow binding of the compound to the EMI1 protein to form a complex and
detecting the formation of a complex of the EMI1 protein and the compound in which
the ability of the compound to bind to the EMI1 protein is indicated by the presence of
the compound in the complex.

The invention further pertains to methods for identifying a compound or agent
which modulates, e.g., stimulates or inhibits, the interaction of the EMI1 protein with a
target molecule, e.g., MADR6, MADR7, or a complex of MADR6 and MADR7. In
these methods, the EMI1 protein is contacted, in the presence of the compound or agent,
with the target molecule under conditions which allow binding of the target molecule to
the EMI1 protein to form a complex. An alteration, e.g., an increase or decrease, in
complex formation between the EMI1 protein and the target molecule as compared to
the amount of complex formed in the absence of the compound or agent is indicative of
the ability of the compound or agent to modulate the interaction of the EMI1 protein
with a target molecule.
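
The comparison described above, in which complex formation in the presence of a
compound is measured against complex formation in its absence, can be sketched as a
simple classification. The tolerance value, the function name and the numeric
readouts below are assumptions introduced for illustration; the text does not specify
how large an alteration must be to count as modulation.

```python
def classify_modulation(with_compound: float,
                        without_compound: float,
                        tolerance: float = 0.10) -> str:
    """Classify a compound by its effect on the amount of complex formed.

    `with_compound` and `without_compound` are amounts of complex measured in
    the same units. `tolerance` is a hypothetical relative change below which
    the two readouts are treated as indistinguishable.
    """
    if without_compound <= 0:
        raise ValueError("the no-compound control must be a positive amount")
    relative_change = (with_compound - without_compound) / without_compound
    if relative_change > tolerance:
        return "stimulates"
    if relative_change < -tolerance:
        return "inhibits"
    return "no clear effect"
```

In practice the tolerance would be set from the variability of replicate control
measurements rather than fixed in advance.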

Brief Description of the Drawing 

Figure 1 depicts the EMI1 nucleotide (SEQ ID NO:1) and amino acid (SEQ ID
NO:2) sequence.

Detailed Description of the Invention

The present invention is based on the discovery of novel molecules, referred to
herein as EMI1 nucleic acid and protein molecules, which play a role in or function in
growth factor signaling pathways. In one embodiment, the EMI1 molecules modulate
the activity of one or more proteins involved in a growth factor signaling pathway, e.g.,
a TGF-β signaling pathway. In a preferred embodiment, the EMI1 molecules of the
present invention are capable of modulating the activity of proteins involved in the
TGF-β signaling pathway to thereby modulate the effects of TGF-β on TGF-β responsive
cells. In a particularly preferred embodiment, the EMI1 molecules are capable of
modulating the activity of MADR proteins, such as MADR6 (the fchd534 gene product)
and MADR7 (the fchd540 gene product), in TGF-β responsive cells. As used herein, an
"MADR protein" is a protein which is involved in the TGF-β signaling pathway and 1)
which includes a domain of at least about 10 amino acid residues which is at least about
40% or more homologous to a domain of the Drosophila MAD protein; or 2) which
includes a PY domain (as defined herein). Examples of human MADR proteins include
hMAD2-4, MADR1, MADR2, MADR6, and MADR7. Non-human MADR proteins
include, for example, Sma2-4 (from C. elegans), Mad2 (from Xenopus), and Drosophila
MAD. Using Northern analysis, MADR6 has been found to be expressed in the
following tissues: heart, placenta, lung, prostate, ovary, and small intestine; and MADR7
has been found to be expressed in the following tissues: heart, brain, placenta, lung,
liver, skeletal muscle, kidney, pancreas, spleen, thymus, prostate, testis, ovary, small
intestine, colon, and peripheral blood leukocytes. As MADR proteins, e.g., MADR6
and MADR7, are involved in the TGF-β signaling pathway in TGF-β responsive cells
and the EMI1 molecules of the invention modulate MADR activity, the EMI1 molecules
can modulate a cell's response to TGF-β. For example, MADR6 and MADR7 inhibit
the beneficial effects (e.g., vascular injury reparatory effects) (Border et al. (1995)
Nature Medicine 1:1000; Grainger et al. (1995) Nature Medicine 1:1067-1073; Nikol et
al. (1992) J. Clin. Invest. 90:1582-1592; Kojima et al. (1991) J. Cell Biol. 113:1439-1445)
of TGF-β on endothelial cells. Thus, the EMI1 protein, by interacting with (e.g.,
binding to) MADR6 and/or MADR7, can modulate (e.g., inhibit) the TGF-β inhibitory
effects of MADR6 and MADR7 in endothelial cells to thereby allow the cells to more
readily receive the beneficial effects of TGF-β. Thus, EMI1 molecules (or modulators
thereof) of the present invention can be used to treat various cardiovascular disorders
such as atherosclerosis, ischemia/reperfusion, hypertension, restenosis, and arterial
inflammation.

In another embodiment, the EMI1 molecules of the invention are capable of
modulating the activity of an MADR protein, such as MADR1, in epithelial cells. For
example, MADR1 mediates TGF-β tumor suppressor effects in epithelial cells, e.g.,
gut-derived epithelial cells. Thus, the EMI1 protein, by interacting with (e.g., binding to)
MADR1, can modulate (e.g., stimulate or inhibit) the TGF-β tumor suppressor effects of
MADR1 in epithelial cells such as colorectal carcinoma cells to thereby inhibit further
growth of the cells, e.g., render the cells more responsive to the tumor suppressor effects
of TGF-β. Thus, EMI1 molecules (or modulators thereof) of the present invention can
also be used to treat various proliferative disorders, e.g., cancers, such as epithelial cell
(e.g., gut associated or gut derived epithelial cell) cancers. In addition, as the EMI1
molecules of the present invention can modulate a TGF-β response in a TGF-β
responsive cell such as an endothelial cell, the EMI1 molecules (or modulators thereof)
can be used to modulate angiogenesis, e.g., pathological angiogenesis (e.g., tumor
angiogenesis) and thus to treat disorders characterized by or associated with pathological
angiogenesis.

TGF-β is also capable of initiating various effects in a variety of different cell
types. For example, TGF-β is an immune regulatory molecule which can act to both
activate and suppress actions of leukocytes, T cells, and macrophages. Furthermore,
administration of TGF-β in animal models of autoimmune diseases has been shown to
ameliorate autoimmune diseases including experimental autoimmune encephalitis (a
model of multiple sclerosis) and experimental arthritis. Thus, molecules, such as the
EMI1 molecules (or modulators thereof) described herein, which are capable of
modulating a TGF-β response in a TGF-β responsive cell, can also modulate TGF-β
responses in immune cells and thus be used to treat autoimmune diseases. In another
example, TGF-β is known to act on connective tissue cells to modulate the production of
extracellular matrix molecules. Overproduction of extracellular matrix molecules results
in fibrotic disorders which can affect vital organs such as the kidney, liver, lung, and
heart. Thus, modulation of TGF-β activity in connective tissue cells, by, for example,
modulating EMI1 activity, is another approach to treating connective tissue disorders,
e.g., fibrotic disorders.

In addition, abnormal production of TGF-β has been implicated in altered
wound healing processes. For example, underproduction of TGF-β has been linked to
impaired wound healing in some subjects, e.g., elderly subjects, subjects with diabetes.
Thus, modulation of TGF-β activity in cells involved in wound healing, e.g., connective
tissue cells, by, for example, modulating EMI1 activity, is one approach to modulating
wound healing.

EMI1 nucleic acid molecules were identified from human breast tissue based on
their ability, as determined using yeast two-hybrid assays (described in detail in
Example 1), to interact with human MADR6 and MADR7 proteins. As described
above, the human MADR6 and MADR7 proteins were previously identified based on
their differential expression in an experimental paradigm of cardiovascular disease. See
United States Serial Number 08/599,654, filed February 9, 1996, and United States
Serial Number 08/799,910, filed February 13, 1997, the contents of which are expressly
incorporated herein by reference. A plasmid containing the full length nucleotide
sequence encoding MADR6 was deposited with the Agricultural Research Service
Culture Collection (NRRL), Peoria, Illinois, on June 6, 1995 and assigned Accession
Number B-21459. A plasmid containing the full length nucleotide sequence encoding
MADR7 was deposited with the American Type Culture Collection (ATCC), Rockville,
Maryland, on February 7, 1996 and assigned Accession Number 69984.

Because of its ability to interact with (e.g., bind to) the MADR6 and MADR7
proteins (and MADR proteins described in the Examples below) which are proteins
involved in the TGF-β signaling pathway, the EMI1 protein is also a protein which
functions in the TGF-β signaling pathway.

The nucleotide sequence of the isolated human EMI1 cDNA and the predicted 
amino acid sequence of the human EMI1 protein are shown in Figure 1 and in SEQ ID 
NOs:1 and 2, respectively. A plasmid containing the full length nucleotide sequence 
encoding human EMI1 (with the DNA insert name of EpFWAl 1) was deposited with 
ATCC on March 27, 1997 and assigned Accession Number 98375. This deposit will be 
maintained under the terms of the Budapest Treaty on the International Recognition of 
the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was 
made merely as a convenience for those of skill in the art and is not an admission that a 
deposit is required under 35 U.S.C. §112. 

A GenBank™ search using the EMI1 nucleotide sequence of SEQ ID NO:1 
revealed four ESTs, one human and three mouse, which were similar to different regions 
of the nucleotide sequence of SEQ ID NO:1. The human EST, ZC51D02 (Accession 
Number AA037190), is identical to a portion (nucleotides 1262 to 1290) of the 3' 
untranslated sequence of SEQ ID NO:1. The first mouse EST, MJ40E12.R1 (Accession 
Number AA051144), is approximately 88% homologous to nucleotides 39 to 529 of 
SEQ ID NO:1. The second mouse EST, ME30A06.R1 (Accession Number W66992), is 
approximately 88% homologous to nucleotides 58 to 416 of SEQ ID NO:1. The third 
mouse EST, MA09E05.R1 (Accession Number W54933), is approximately 88% 
homologous to nucleotides 25 to 275 of SEQ ID NO:1. As no reading frame can be 
determined from an EST (such as an EST identified in the above database searches), 
an amino acid sequence encoded by the EST cannot be determined. 

GenPept™ and SwissProt™ database searches of the EMI1 amino acid sequence 
of SEQ ID NO:2 revealed three polypeptide sequences which include WW domains 
which are similar to SEQ ID NO:4. Two of these polypeptide sequences are derived 
from yeast and one of the polypeptide sequences is derived from humans. The first 
yeast polypeptide sequence, RSP5 (SwissProt™ Accession Number P39940), includes a 
domain which is approximately 70% homologous to SEQ ID NO:4. The second yeast 
polypeptide sequence, ubiquitin protein ligase (GenPept™ Accession Number Y07592), 
includes a domain which is approximately 63% homologous to SEQ ID NO:4. The 
human polypeptide sequence, NEDD4 (SwissProt™ Accession Number T46934), 
includes a domain which is approximately 53% homologous to SEQ ID NO:4. GenPept™ 
and SwissProt™ database searches of the amino acid sequence of SEQ ID NO:2 
(using a score of 50 and a word length of 3) revealed no full length human protein 
sequence. Thus, the present invention also pertains to proteins which have an amino 
acid sequence which is substantially homologous to the amino acid sequence of SEQ ID 
NO:2 (encoded by the open reading frame shown in SEQ ID NO:3) or an amino acid 
sequence encoded by the nucleotide sequence of the DNA insert of the plasmid 
deposited with ATCC as Accession Number 98375. As used herein, a protein which has 
an amino acid sequence which is substantially homologous to a selected amino acid 
sequence is at least about 50% homologous to the selected amino acid sequence, e.g., the 
entire selected amino acid sequence. A protein which has an amino acid sequence which 
is substantially homologous to a selected amino acid sequence can also be at least about 
60-70%, preferably at least about 80-85%, and more preferably at least about 86, 88, 
90%, and most preferably at least about 90-95% or more homologous to the selected amino 
acid sequence. 

The human EMI1 gene, which is approximately 1290 nucleotides in length, 
encodes a full length protein having a molecular weight of approximately 21 kD and 
which is approximately 335 amino acid residues in length. The EMI1 protein is 
expressed at least in endothelial cells and, as the nucleic acid encoding EMI1 protein 
was isolated from a human breast library, EMI1 protein is also most likely expressed in 
cells, e.g., parenchymal cells (e.g., epithelial cells) and stromal cells (e.g., connective 
tissue cells), found in breast tissue. The carboxy-terminal 36 amino acid residues (amino 
acid residues 300 to 335) comprise a WW domain (SEQ ID NO:4). As used herein, the 
term "WW domain" refers to a structural amino acid motif which includes about 30-40 
(typically 38) semiconserved amino acid residues, two of which are conserved 
tryptophan (W) residues. A WW domain also preferably includes a high content of 
polar amino acid residues and the presence of prolines distributed preferentially towards 
both termini of the protein sequence (Sudol et al. (1995) FEBS Letters 369:67-71). The 
WW domain of EMI1 comprises amino acids 300 to 335 of SEQ ID NO:2 (shown as 
SEQ ID NO:4) (which is encoded by nucleotides 974 to 1081 of SEQ ID NO:1) as 
follows (the proline and conserved tryptophan residues are in bold and underlined): 

300 DALPAGWEQRELPNGRVYYVDHNTKTTTWERPLPPG 335 (SEQ ID NO:4) 
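
The residue numbering of the WW domain can be checked with a short Python sketch. This is an illustration only, not part of the disclosed methods; the helper name `residue_positions` is hypothetical, and the sequence and the starting residue number (300) are taken from the text above.

```python
# SEQ ID NO:4 as printed in the text (residues 300 to 335 of SEQ ID NO:2).
WW_DOMAIN = "DALPAGWEQRELPNGRVYYVDHNTKTTTWERPLPPG"
WW_START = 300  # residue number of the first amino acid, taken from the text


def residue_positions(seq, residue, start=1):
    """Return the residue numbers (counting from `start`) where `residue` occurs."""
    return [start + i for i, aa in enumerate(seq) if aa == residue]


# Locate the tryptophan (W) and proline (P) residues highlighted in the text.
tryptophans = residue_positions(WW_DOMAIN, "W", WW_START)
prolines = residue_positions(WW_DOMAIN, "P", WW_START)

print(len(WW_DOMAIN))  # 36
print(tryptophans)     # [306, 328]
print(prolines)        # [303, 312, 331, 333, 334]
```

In this numbering the two tryptophans fall at residues 306 and 328, and the prolines cluster towards both ends of the 36-residue domain, consistent with the description of a WW domain given above.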

The consensus sequence bound by the WW domain of the EMI1 protein 
comprises a PY motif (Chen and Sudol (1995) PNAS 92:7819-7823). As used herein, "a 
PY motif or PY domain" is an amino acid sequence of at least about 4-5 amino acid 
residues which includes a proline-rich domain followed by a tyrosine residue. The 
particular PY motifs to which the WW domain of the EMI1 protein binds include the 
following amino acid sequence: XPPXY wherein X can be any amino acid residue. 
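
As a minimal sketch of how the XPPXY pattern could be located in a protein sequence, the following Python example uses a regular expression. The function name `find_py_motifs` and the toy input sequence are hypothetical and are not taken from any MADR protein; the sketch assumes one-letter amino acid codes.

```python
import re

# XPPXY, where X is any amino acid residue (one-letter code).
# The lookahead allows overlapping occurrences to be reported.
PY_MOTIF = re.compile(r"(?=([A-Z]PP[A-Z]Y))")


def find_py_motifs(protein_seq):
    """Return (1-based start position, matched 5-mer) for every XPPXY occurrence."""
    return [(m.start() + 1, m.group(1))
            for m in PY_MOTIF.finditer(protein_seq.upper())]


# Hypothetical toy sequence used only for illustration.
print(find_py_motifs("MSTPPPGYLSEDG"))  # [(4, 'PPPGY')]
```

Note that this pattern finds only the specific XPPXY form; the broader definition of a PY motif given above (a proline-rich region followed by a tyrosine) would need a looser pattern.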

Several MADR proteins, including, for example, MADR1, hMAD2-4, MADR6, and 
MADR7, contain a PY motif. The PY motifs of several MADR proteins are set forth in 
Example 4 (Table 5) below. 

The EMI1 protein or a biologically active portion or fragment of the invention 
can have one or more of the following activities: 1) it can interact with (e.g., bind to) an 
MADR protein; 2) it can modulate the activity of an MADR protein; 3) it can interact 
with (e.g., bind to) a protein having a PY motif; 4) it can modulate the activity of a 
protein having a PY motif; and 5) it can modulate a TGF-β response in a TGF-β 
responsive cell, e.g., an epithelial cell, an endothelial cell, to thereby beneficially affect 
the TGF-β responsive cell. 

Various aspects of the invention are described in further detail in the following 
subsections: 

1. Isolated Nucleic Acid Molecules 

One aspect of the invention pertains to isolated nucleic acid molecules that 
encode EMI1 or biologically active portions thereof, as well as nucleic acid fragments 
sufficient for use as hybridization probes to identify EMI1-encoding nucleic acid (e.g., 
EMI1 mRNA). As used herein, the term "nucleic acid molecule" is intended to include 
DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and 
analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid 
molecule can be single-stranded or double-stranded, but preferably is double-stranded 
DNA. An "isolated" nucleic acid molecule is one which is separated from other nucleic 
acid molecules which are present in the natural source of the nucleic acid. Preferably, an 
"isolated" nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., 
sequences located at the 5' and 3' ends of the nucleic acid) in the genomic DNA of the 
organism from which the nucleic acid is derived. For example, in various embodiments, 
the isolated EMI1 nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 
1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid 
molecule in genomic DNA of the cell from which the nucleic acid is derived (e.g., an 
endothelial cell). Moreover, an "isolated" nucleic acid molecule, such as a cDNA 
molecule, can be substantially free of other cellular material, or culture medium when 
produced by recombinant techniques, or chemical precursors or other chemicals when 
chemically synthesized. 

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule 
having the nucleotide sequence of SEQ ID NO:1, or a portion thereof, can be isolated 
using standard molecular biology techniques and the sequence information provided 
herein. For example, a human EMI1 cDNA can be isolated from a human breast library 
using all or a portion of SEQ ID NO:1 as a hybridization probe and standard hybridization 
techniques (e.g., as described in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular 
Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, NY, 1989). Moreover, a nucleic acid 
molecule encompassing all or a portion of SEQ ID NO:1 can be isolated by the 
polymerase chain reaction using oligonucleotide primers designed based upon the 
sequence of SEQ ID NO:1. For example, mRNA can be isolated from normal 
endothelial cells (e.g., by the guanidinium-thiocyanate extraction procedure of Chirgwin 
et al. (1979) Biochemistry 18:5294-5299) and cDNA can be prepared using reverse 
transcriptase (e.g., Moloney MLV reverse transcriptase, available from Gibco/BRL, 
Bethesda, MD; or AMV reverse transcriptase, available from Seikagaku America, Inc., 
St. Petersburg, FL). Synthetic oligonucleotide primers for PCR amplification can be 
designed based upon the nucleotide sequence shown in SEQ ID NO:1. A nucleic acid of 
the invention can be amplified using cDNA or, alternatively, genomic DNA, as a 
template and appropriate oligonucleotide primers according to standard PCR 
amplification techniques. The nucleic acid so amplified can be cloned into an 
appropriate vector and characterized by DNA sequence analysis. Furthermore, 
oligonucleotides corresponding to an EMI1 nucleotide sequence can be prepared by 
standard synthetic techniques, e.g., using an automated DNA synthesizer. 

In a preferred embodiment, an isolated nucleic acid molecule of the invention 
comprises the nucleotide sequence shown in SEQ ID NO:1 or the nucleotide sequence of 
the DNA insert of the plasmid deposited with ATCC as Accession Number 98375. The 
sequence of SEQ ID NO:1 corresponds to the human EMI1 cDNA. This cDNA 
comprises sequences encoding the EMI1 protein (i.e., "the coding region", from 
nucleotides 77 to 1081), as well as 5' untranslated sequences (nucleotides 1 to 76) and 3' 
untranslated sequences (nucleotides 1082 to 1290). Alternatively, the nucleic acid 
molecule can comprise only the coding region of SEQ ID NO:1 (e.g., nucleotides 77 to 
1081). 
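
The coordinates quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: it assumes the coordinates are 1-based and inclusive, and the names `REGIONS`, `region_length`, and `extract` are hypothetical helpers rather than anything defined in this specification.

```python
# 1-based, inclusive coordinates quoted in the text for SEQ ID NO:1.
REGIONS = {
    "5' UTR": (1, 76),
    "coding region": (77, 1081),
    "3' UTR": (1082, 1290),
}


def region_length(start, end):
    """Length of a 1-based, inclusive interval."""
    return end - start + 1


def extract(seq, start, end):
    """Slice a 1-based, inclusive region out of a 0-based Python string."""
    return seq[start - 1:end]


lengths = {name: region_length(*coords) for name, coords in REGIONS.items()}

print(lengths["5' UTR"], lengths["coding region"], lengths["3' UTR"])  # 76 1005 209
print(sum(lengths.values()))          # 1290, the stated length of the cDNA
print(lengths["coding region"] // 3)  # 335 codons, matching the 335-residue protein
```

Under these assumptions the three regions tile the full 1290 nucleotides, and the 1005-nucleotide coding region corresponds to 335 codons.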

In another preferred embodiment, an isolated nucleic acid molecule of the 
invention comprises a nucleic acid molecule which is a complement of the nucleotide 
sequence shown in SEQ ID NO:1, the nucleotide sequence of the DNA insert of the 
plasmid deposited with ATCC as Accession Number 98375, or a portion of either of 
these nucleotide sequences. A nucleic acid molecule which is complementary to the 
nucleotide sequence shown in SEQ ID NO:1 is one which is sufficiently complementary 
to the nucleotide sequence shown in SEQ ID NO:1 such that it can hybridize to the 
nucleotide sequence shown in SEQ ID NO:1, thereby forming a stable duplex. 

In still another preferred embodiment, an isolated nucleic acid molecule of the 
invention comprises a nucleotide sequence which is at least about 60-65%, preferably at 
least about 70-75%, more preferably at least about 80-85%, and even more preferably at 
least about 90-95% or more homologous to the nucleotide sequence shown in SEQ ID 
NO:1, the nucleotide sequence of the DNA insert of the plasmid deposited with ATCC 
as Accession Number 98375, or a portion of either of these nucleotide sequences. In an 
additional preferred embodiment, an isolated nucleic acid molecule of the invention 
comprises a nucleotide sequence which hybridizes, e.g., hybridizes under stringent 
conditions, to the nucleotide sequence shown in SEQ ID NO:1, the nucleotide sequence 
of the DNA insert of the plasmid deposited with ATCC as Accession Number 98375, or 
a portion of either of these nucleotide sequences. 

Moreover, the nucleic acid molecule of the invention can comprise only a portion 
of the coding region of SEQ ID NO:1, for example a fragment which can be used as a 
probe or primer or a fragment encoding a biologically active portion of EMI1. The 
nucleotide sequence determined from the cloning of the EMI1 gene from a mammal 
allows for the generation of probes and primers designed for use in identifying and/or 
cloning EMI1 homologues in other cell types, e.g., from other tissues, as well as EMI1 
homologues from other mammals. The probe/primer typically comprises substantially 
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide 
sequence that hybridizes under stringent conditions to at least about 12, preferably about 
25, more preferably about 40, 50 or 75 consecutive nucleotides of the sense sequence of 
SEQ ID NO:1, an anti-sense sequence of SEQ ID NO:1, or naturally occurring mutants thereof. 
Primers based on the nucleotide sequence in SEQ ID NO:1 can be used in PCR reactions 
to clone EMI1 homologues. Probes based on the EMI1 nucleotide sequences can be 
used to detect transcripts or genomic sequences encoding the same or homologous 
proteins. In preferred embodiments, the probe further comprises a label group attached 
thereto, e.g., the label group can be a radioisotope, a fluorescent compound, an enzyme, 
or an enzyme co-factor. Such probes can be used as a part of a diagnostic test kit for 
identifying cells or tissue which misexpress an EMI1 protein, such as by measuring a 
level of an EMI1-encoding nucleic acid in a sample of cells from a subject, e.g., 
detecting EMI1 mRNA levels or determining whether a genomic EMI1 gene has been 
mutated or deleted. 

In one embodiment, the nucleic acid molecule of the invention encodes a protein 
or portion thereof which includes an amino acid sequence which is sufficiently 
homologous to an amino acid sequence of SEQ ID NO:2 or an amino acid sequence 
encoded by the nucleotide sequence of the DNA insert of the plasmid deposited with 
ATCC as Accession Number 98375 such that the protein or portion thereof maintains 
the ability to modulate a TGF-β response in a TGF-β responsive cell. As used herein, 
the language "sufficiently homologous" refers to proteins or portions thereof which have 
amino acid sequences which include a minimum number of identical or equivalent (e.g., 
an amino acid residue which has a similar side chain as an amino acid residue in SEQ ID 
NO:2) amino acid residues to an amino acid sequence of SEQ ID NO:2 or an amino acid 
sequence encoded by the nucleotide sequence of the DNA insert of the plasmid 
deposited with ATCC as Accession Number 98375 such that the protein or portion 
thereof is able to modulate a TGF-β response in a TGF-β responsive cell. Members of 
the TGF-β family of proteins, as described herein, initiate a variety of responses in many 
different cell types. Examples of such responses are also described herein. Thus, a 
"TGF-β response in a TGF-β responsive cell" is a cellular response to a member of the 
TGF-β family of proteins. Non-limiting examples of the subfamilies included in the 
TGF-β family of proteins include members of the TGF-β subfamily, which comprises at 
least four genes that are much more similar to TGF-β1 than to other members of the 
TGF-β family of proteins; the activin subfamily, comprising homo- or hetero-dimers of 
two sub-units, inhibin β-A and inhibin β-B; the decapentaplegic (DPP) subfamily, 
including the mammalian factors BMP2 and BMP4, which can induce the formation of 
ectopic bone and cartilage when implanted under the skin or into muscles; and the 60A 
subfamily, which includes a number of mammalian homologues, e.g., BMP5-8, with 
osteoinductive activity. Other members of the TGF-β family of proteins include growth 
differentiation factor 1 (GDF-1), GDF-3/VGR-2, dorsalin, nodal, Mullerian-inhibiting 
substance (MIS), and glial-derived neurotrophic growth factor (GDNF). The DPP and 
60A subfamilies are related more closely to one another than to other members of the 
TGF-β superfamily, and have often been grouped together as part of a larger collection of 
molecules called DVR (dpp and vg1 related). In another embodiment, the protein is at 
least about 60-70%, preferably at least about 80-85%, and more preferably at least about 
86, 88, 90%, and most preferably at least about 90-95% or more homologous to the 
entire amino acid sequence of SEQ ID NO:2. 

Portions of proteins encoded by the EMI1 nucleic acid molecule of the invention 
are preferably biologically active portions of the EMI1 protein. As used herein, the term 
"biologically active portion of EMI1" is intended to include a portion, e.g., a 
domain/motif, of EMI1 that has one or more of the following activities: 1) it can 
interact with (e.g., bind to) an MADR protein; 2) it can modulate the activity of an 
MADR protein; 3) it can interact with (e.g., bind to) a protein having a PY motif; 4) it 
can modulate the activity of a protein having a PY motif; and 5) it can modulate a TGF-β 
response in a TGF-β responsive cell, e.g., an epithelial cell, an endothelial cell, to, for 
example, beneficially affect the TGF-β responsive cell. Standard binding assays, e.g., 
immunoprecipitations and yeast two-hybrid assays as described herein, can be 
performed to determine the ability of an EMI1 protein or a biologically active portion 
thereof to interact with (e.g., bind to) an MADR protein or a protein having a PY motif. 
To determine whether an EMI1 protein or a biologically active portion thereof can 
modulate a TGF-β response in a TGF-β responsive cell such as an endothelial cell, 
endothelial cells, e.g., bovine aortic endothelial cells, can be transfected with a TGF-β 
responsive reporter construct, e.g., p3TP-Lux (Wrana et al. (1994) Nature 370:341-347), 
which responds to TGF-β signaling by expressing luciferase, and a nucleic acid 
encoding the EMI1 protein or biologically active portion thereof. The endothelial cells 
can then be exposed to TGF-β. After exposure of the cells to TGF-β, the cells can be 
harvested and lysed and reporter activity, e.g., luciferase activity, can be measured and 
compared to control reporter activity. The ability of an EMI1 protein or a biologically 
active portion thereof to modulate an MADR protein activity can be determined using an 
assay similar to the assay described above for determining the ability of an EMI1 protein 
or a biologically active portion thereof to modulate a TGF-β response in TGF-β 
responsive cells. In particular, endothelial cells, e.g., bovine aortic endothelial cells, can 
be transfected with a TGF-β responsive reporter construct, e.g., p3TP-Lux (Wrana et al. 
(1994) Nature 370:341-347), which responds to TGF-β signaling by expressing 
luciferase, and an expression vector which expresses an MADR protein (e.g., pCI 
expression vectors (Promega, Madison, WI) which express MADR6 and/or MADR7, 
pCMV5MADR1-Flag (Hoodless et al. (1996) Cell 85:489-500), or pCMV5MADR2-Flag 
(Eppert et al. (1996) Cell 86:543-552)). The endothelial cells can then be exposed 
to TGF-β. After exposure of the cells to TGF-β, the cells can be harvested and lysed 
and reporter activity, e.g., luciferase activity, can be measured and compared to reporter 
activity in endothelial cells which also include nucleic acid encoding the EMI1 protein 
or biologically active portion thereof. An alteration in reporter activity in the endothelial 
cells which include nucleic acid encoding the EMI1 protein as compared to reporter 
activity in the endothelial cells without nucleic acid encoding the EMI1 protein is 
indicative of a modulation of a TGF-β response in the TGF-β responsive cell. 

In one embodiment, the biologically active portion of EMI1 comprises a WW 
domain. Preferably, the WW domain is encoded by a nucleic acid molecule derived 
from a human and is at least about 55%, preferably at least about 60-65%, even more 
preferably at least about 70-75%, and most preferably at least about 80-90% or more 
homologous to SEQ ID NO:4. If the WW domain is encoded by a nonmammalian 
nucleic acid, it is preferably at least about 75%, preferably at least about 80-85%, most 
preferably at least about 90-95% or more homologous to SEQ ID NO:4. In a preferred 
embodiment, the biologically active portion of the protein which includes the WW 
domain can modulate the activity of a protein having a PY motif and/or modulate a 
TGF-β response in a TGF-β responsive cell, e.g., an endothelial cell, to thereby 
beneficially affect the TGF-β responsive cell. In a preferred embodiment, the 
biologically active portion comprises the WW domain of EMI1 as represented by amino 
acid residues 300 to 335 of SEQ ID NO:2 and as SEQ ID NO:4. Additional nucleic acid 
fragments encoding biologically active portions of EMI1 can be prepared by isolating a 
portion of SEQ ID NO:1, expressing the encoded portion of EMI1 protein or peptide 
(e.g., by recombinant expression in vitro) and assessing the activity of the encoded 
portion of EMI1 protein or peptide. 

The invention further encompasses nucleic acid molecules that differ from the 
nucleotide sequence shown in SEQ ID NO:1 (and portions thereof) due to degeneracy of 
the genetic code and thus encode the same EMI1 protein as that encoded by the 
nucleotide sequence shown in SEQ ID NO:1. In another embodiment, an isolated 
nucleic acid molecule of the invention has a nucleotide sequence encoding a protein 
having an amino acid sequence shown in SEQ ID NO:2 or a protein having an amino 
acid sequence encoded by the nucleotide sequence of the DNA insert of the plasmid 
deposited with ATCC as Accession Number 98375. In a still further embodiment, the 
nucleic acid molecule of the invention encodes a full length human protein which is 
substantially homologous to the amino acid sequence of SEQ ID NO:2 (encoded by the 
open reading frame shown in SEQ ID NO:3) or an amino acid sequence encoded by the 
nucleotide sequence of the DNA insert of the plasmid deposited with ATCC as 
Accession Number 98375. 

In addition to the human EMI1 nucleotide sequence shown in SEQ ID NO:1, it 
will be appreciated by those skilled in the art that DNA sequence polymorphisms that 
lead to changes in the amino acid sequences of EMI1 may exist within a population 
(e.g., the human population). Such genetic polymorphism in the EMI1 gene may exist 
among individuals within a population due to natural allelic variation. As used herein, 
the terms "gene" and "recombinant gene" refer to nucleic acid molecules comprising an 
open reading frame encoding an EMI1 protein, preferably a mammalian EMI1 protein. 
Such natural allelic variations can typically result in 1-5% variance in the nucleotide 
sequence of the EMI1 gene. Any and all such nucleotide variations and resulting amino 
acid polymorphisms in EMI1 that are the result of natural allelic variation and that do 
not alter the functional activity of EMI1 are intended to be within the scope of the 
invention. Moreover, nucleic acid molecules encoding EMI1 proteins from other 
species, and thus which have a nucleotide sequence which differs from the human 
sequence of SEQ ID NO:1, are intended to be within the scope of the invention. Nucleic 
acid molecules corresponding to natural allelic variants and nonhuman homologues of 
the human EMI1 cDNA of the invention can be isolated based on their homology to the 
human EMI1 nucleic acid disclosed herein using the human cDNA, or a portion thereof, 
as a hybridization probe according to standard hybridization techniques under stringent 
hybridization conditions. Accordingly, in another embodiment, an isolated nucleic acid 
molecule of the invention is at least 15 nucleotides in length and hybridizes under 
stringent conditions to the nucleic acid molecule comprising the nucleotide sequence of 
SEQ ID NO:1 or the nucleotide sequence of the DNA insert of the plasmid deposited 
with ATCC as Accession Number 98375. In other embodiments, the nucleic acid is at 
least 30, 50, 100, 250 or 500 nucleotides in length. As used herein, the term "hybridizes 
under stringent conditions" is intended to describe conditions for hybridization and 
washing under which nucleotide sequences at least 60% homologous to each other 
typically remain hybridized to each other. Preferably, the conditions are such that 
sequences at least about 65%, more preferably at least about 70%, and even more 
preferably at least about 75% or more homologous to each other typically remain 
hybridized to each other. Such stringent conditions are known to those skilled in the art 
and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. 
(1989), 6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization 
conditions is hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, 
followed by one or more washes in 0.2X SSC, 0.1% SDS at 50-65°C. Preferably, an 
isolated nucleic acid molecule of the invention that hybridizes under stringent conditions 
to the sequence of SEQ ID NO:1 corresponds to a naturally-occurring nucleic acid 
molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an 
RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes 
a natural protein). In one embodiment, the nucleic acid encodes a natural human EMI1 
protein. 

In addition to naturally-occurring allelic variants of the EMI1 sequence that may 
exist in the population, the skilled artisan will further appreciate that changes can be 
introduced by mutation into the nucleotide sequence of SEQ ID NO:1, thereby leading 
to changes in the amino acid sequence of the encoded EMI1 protein, without altering the 
functional ability of the EMI1 protein. For example, nucleotide substitutions leading to 
amino acid substitutions at "non-essential" amino acid residues can be made in the 
sequence of SEQ ID NO:1. A "non-essential" amino acid residue is a residue that can be 
altered from the wild-type sequence of EMI1 (e.g., the sequence of SEQ ID NO:2) 
without altering the activity of EMI1, whereas an "essential" amino acid residue is 
required for EMI1 activity. For example, conserved amino acid residues, e.g., 
tryptophans and prolines, in the WW domain of EMI1 are most likely important for 
binding to MADR proteins and are thus essential residues of EMI1. Other amino acid 
residues, however, (e.g., those that are not conserved or only semi-conserved in the WW 
domain) may not be essential for activity and thus are likely to be amenable to alteration 
without altering EMI1 activity. 

Accordingly, another aspect of the invention pertains to nucleic acid molecules 
encoding EMI1 proteins that contain changes in amino acid residues that are not 
essential for EMI1 activity. Such EMI1 proteins differ in amino acid sequence from 
SEQ ID NO:2 yet retain at least one of the EMI1 activities described herein. In one 
embodiment, the isolated nucleic acid molecule comprises a nucleotide sequence 
encoding a protein, wherein the protein comprises an amino acid sequence at least about 
60% homologous to the amino acid sequence of SEQ ID NO:2 and is capable of 
modulating a TGF-β response in a TGF-β responsive cell. Preferably, the protein 
encoded by the nucleic acid molecule is at least about 70% homologous to SEQ ID 
NO:2, more preferably at least about 80-85% homologous to SEQ ID NO:2, even more 
preferably at least about 90% homologous to SEQ ID NO:2, and most preferably at least 
about 95-99% homologous to SEQ ID NO:2. 

To determine the percent homology of two amino acid sequences (e.g., SEQ ID 
NO:2 and a mutant form thereof) or of two nucleic acids, the sequences are aligned for 
optimal comparison purposes (e.g., gaps can be introduced in the sequence of one 
protein or nucleic acid for optimal alignment with the other protein or nucleic acid). The 
amino acid residues or nucleotides at corresponding amino acid positions or nucleotide 
positions are then compared. When a position in one sequence (e.g., SEQ ID NO:2) is 
occupied by the same amino acid residue or nucleotide as the corresponding position in 
the other sequence (e.g., a mutant form of EMI1), then the molecules are homologous at 
that position (i.e., as used herein amino acid or nucleic acid "homology" is equivalent to 
amino acid or nucleic acid "identity"). The percent homology between the two 
sequences is a function of the number of identical positions shared by the sequences 
(i.e., % homology = # of identical positions/total # of positions x 100). 
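
The calculation described above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: it assumes the two sequences have already been aligned (with gaps written as "-"), and it assumes that "total # of positions" means the number of alignment columns, which the text does not specify. The function name `percent_homology` is hypothetical.

```python
def percent_homology(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: the denominator is the total number of alignment columns,
    including columns that contain a gap in either sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    total = len(aligned_a)
    if total == 0:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / total


# Toy example: 3 identical positions out of 4 columns.
print(percent_homology("ACGT", "ACGA"))  # 75.0
```

Other conventions for the denominator (for example, the length of the shorter sequence, or only ungapped columns) would give different percentages for the same alignment, so the choice should be stated when reporting a value.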

An isolated nucleic acid molecule encoding an EMI1 protein homologous to the 
protein of SEQ ID NO:2 can be created by introducing one or more nucleotide 
substitutions, additions or deletions into the nucleotide sequence of SEQ ID NO:1 such 
that one or more amino acid substitutions, additions or deletions are introduced into the 
encoded protein. Mutations can be introduced into SEQ ID NO:1 by standard 
techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. 
Preferably, conservative amino acid substitutions are made at one or more predicted 
non-essential amino acid residues. A "conservative amino acid substitution" is one in 
which the amino acid residue is replaced with an amino acid residue having a similar 
side chain. Families of amino acid residues having similar side chains have been 
defined in the art. These families include amino acids with basic side chains (e.g., 
lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), 
uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, 
tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, 
proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., 
threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, 
tryptophan, histidine). Thus, a predicted nonessential amino acid residue in EMI1 is 
preferably replaced with another amino acid residue from the same side chain family. 
Alternatively, in another embodiment, mutations can be introduced randomly along all 
or part of an EMI1 coding sequence, such as by saturation mutagenesis, and the resultant 
mutants can be screened for an EMI1 activity described herein to identify mutants that 
retain EMI1 activity. Following mutagenesis of SEQ ID NO:1, the encoded protein can 
be expressed recombinantly (e.g., as described in Examples 2 and 3) and the activity of 
the protein can be determined using, for example, assays described herein. 
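
The side chain families listed in the preceding paragraph can be encoded as a lookup table to test whether a proposed substitution is conservative in the sense used here. The following Python sketch is illustrative only; the family membership is transcribed from the text using one-letter codes, while the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
# Some residues belong to more than one family (e.g., valine, tyrosine).
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original, replacement):
    """Return the sorted names of the families that contain both residues."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """True if the two residues differ and share at least one side chain family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"))  # True  (both basic)
print(is_conservative("D", "K"))  # False (acidic vs. basic)
print(shared_families("W", "F"))  # ['aromatic', 'nonpolar']
```

Because the families overlap, a substitution is treated here as conservative if the residues share any one family; a stricter rule could require agreement in every family to which the original residue belongs.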

In addition to the nucleic acid molecules encoding EMI1 proteins described
above, another aspect of the invention pertains to isolated nucleic acid molecules which
are antisense thereto. An "antisense" nucleic acid comprises a nucleotide sequence
which is complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the coding strand of a double-stranded cDNA molecule or
complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can
hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be
complementary to an entire EMI1 coding strand, or to only a portion thereof. In one
embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of the
coding strand of a nucleotide sequence encoding EMI1. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into
amino acid residues (e.g., the entire coding region of SEQ ID NO:1 comprises
nucleotides 77 to 1081). In another embodiment, the antisense nucleic acid molecule is
antisense to a "noncoding region" of the coding strand of a nucleotide sequence
encoding EMI1. The term "noncoding region" refers to 5' and 3' sequences which flank
the coding region that are not translated into amino acids (i.e., also referred to as 5' and
3' untranslated regions).
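
The distinction between coding and noncoding regions can be illustrated with a short sketch. It assumes 1-based, inclusive nucleotide positions (the convention implied by "nucleotides 77 to 1081"); the function name `split_transcript` and the toy sequence are hypothetical and are not taken from SEQ ID NO:1.

```python
def split_transcript(seq, cds_start, cds_end):
    """Split a coding-strand sequence into 5' noncoding, coding and
    3' noncoding parts. Positions are 1-based and inclusive."""
    if not (1 <= cds_start <= cds_end <= len(seq)):
        raise ValueError("coding region must lie within the sequence")
    five_prime = seq[: cds_start - 1]
    coding = seq[cds_start - 1 : cds_end]
    three_prime = seq[cds_end:]
    return five_prime, coding, three_prime


# Coordinates quoted in the text for the coding region of SEQ ID NO:1.
CDS_START, CDS_END = 77, 1081
cds_length = CDS_END - CDS_START + 1  # 1005 nucleotides
codon_count = cds_length // 3         # 335 triplets

if __name__ == "__main__":
    print(cds_length, codon_count, cds_length % 3)
    # Toy sequence (an assumption for illustration only).
    print(split_transcript("GGATGAAATAACC", 3, 11))
```

With these coordinates the coding region spans 1005 nucleotides, which divides evenly into 335 triplets.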

Given the coding strand sequences encoding EMI1 disclosed herein (e.g., SEQ
ID NO:1), antisense nucleic acids of the invention can be designed according to the rules
of Watson and Crick base pairing. The antisense nucleic acid molecule can be
complementary to the entire coding region of EMI1 mRNA, but more preferably is an
oligonucleotide which is antisense to only a portion of the coding or noncoding region of
EMI1 mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of EMI1 mRNA. An antisense
oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length. An antisense nucleic acid of the invention can be constructed
using chemical synthesis and enzymatic ligation reactions using procedures known in
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase
the physical stability of the duplex formed between the antisense and sense nucleic
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be
used. Examples of modified nucleotides which can be used to generate the antisense
nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil,
4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic
acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into
which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA
transcribed from the inserted nucleic acid will be of an antisense orientation to a target
nucleic acid of interest, described further in the following subsection).
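
As a hedged illustration of designing an antisense oligonucleotide by Watson and Crick base pairing, the sketch below takes the reverse complement of a window of the coding strand around the translation start site. The function names, the default length of 20 and the toy sequence are assumptions for illustration; the sequence is not derived from SEQ ID NO:1, and a DNA alphabet is used for simplicity.

```python
# Watson-Crick complements for a DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start_codon_pos, length=20):
    """Return an oligonucleotide antisense to a window of roughly `length`
    nucleotides positioned around the start codon.

    start_codon_pos is the 1-based position of the A of the ATG codon.
    The window is truncated if it runs past either end of the sequence.
    """
    half = length // 2
    begin = max(0, start_codon_pos - 1 - half)
    target = coding_strand[begin : begin + length]
    return reverse_complement(target)


if __name__ == "__main__":
    toy = "GCCGCCACCATGGCGTCA"  # hypothetical sequence, ATG at position 10
    print(antisense_oligo(toy, start_codon_pos=10, length=12))
```

The resulting oligonucleotide contains CAT, the complement of the ATG start codon read in the antisense direction.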

The antisense nucleic acid molecules of the invention are typically administered
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA
and/or genomic DNA encoding an EMI1 protein to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the
case of an antisense nucleic acid molecule which binds to DNA duplexes, through
specific interactions in the major groove of the double helix. An example of a route of
administration of an antisense nucleic acid molecule of the invention includes direct
injection at a tissue site. Alternatively, an antisense nucleic acid molecule can be
modified to target selected cells and then administered systemically. For example, for
systemic administration, an antisense molecule can be modified such that it specifically
binds to a receptor or an antigen expressed on a selected cell surface, e.g., by linking the
antisense nucleic acid molecule to a peptide or an antibody which binds to a cell surface
receptor or antigen. The antisense nucleic acid molecule can also be delivered to cells
using the vectors described herein. To achieve sufficient intracellular concentrations of
the antisense molecules, vector constructs in which the antisense nucleic acid molecule
is placed under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention
is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the
usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids
Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a
2'-o-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which are
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they
have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes
(described in Haselhoff and Gerlach (1988) Nature 334:585-591)) can be used to
catalytically cleave EMI1 mRNA transcripts to thereby inhibit translation of EMI1
mRNA. A ribozyme having specificity for an EMI1-encoding nucleic acid can be
designed based upon the nucleotide sequence of an EMI1 cDNA disclosed herein (i.e.,
SEQ ID NO:1). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in an EMI1-encoding mRNA. See, e.g., Cech et al.
U.S. Patent No. 4,987,071 and Cech et al. U.S. Patent No. 5,116,742. Alternatively,
EMI1 mRNA can be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel, D. and Szostak, J.W. (1993)
Science 261:1411-1418.

Alternatively, EMI1 gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of the EMI1 gene (e.g., the EMI1
promoter and/or enhancers) to form triple helical structures that prevent transcription
of the EMI1 gene in target cells. See generally, Helene, C. (1991) Anticancer Drug Des.
6(6):569-84; Helene, C. et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L.J.
(1992) Bioassays 14(12):807-15.



II. Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression
vectors, containing a nucleic acid encoding EMI1 (or a portion thereof). As used herein,
the term "vector" refers to a nucleic acid molecule capable of transporting another
nucleic acid to which it has been linked. One type of vector is a "plasmid", which refers
to a circular double stranded DNA loop into which additional DNA segments can be
ligated. Another type of vector is a viral vector, wherein additional DNA segments can
be ligated into the viral genome. Certain vectors are capable of autonomous replication
in a host cell into which they are introduced (e.g., bacterial vectors having a bacterial
origin of replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a host cell upon
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they
are operatively linked. Such vectors are referred to herein as "expression vectors". In
general, expression vectors of utility in recombinant DNA techniques are often in the
form of plasmids. In the present specification, "plasmid" and "vector" can be used
interchangeably as the plasmid is the most commonly used form of vector. However,
the invention is intended to include such other forms of expression vectors, such as viral
vectors (e.g., replication defective retroviruses, adenoviruses and adeno-associated
viruses), which serve equivalent functions.




The recombinant expression vectors of the invention comprise a nucleic acid of
the invention in a form suitable for expression of the nucleic acid in a host cell, which
means that the recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for expression, which are
operatively linked to the nucleic acid sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which allows for expression
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a
host cell when the vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such regulatory sequences are described, for
example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990). Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many types of host cell and
those which direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art
that the design of the expression vector can depend on such factors as the choice of the
host cell to be transformed, the level of expression of protein desired, etc. The
expression vectors of the invention can be introduced into host cells to thereby produce
proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as
described herein (e.g., EMI1 proteins, mutant forms of EMI1, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for
expression of EMI1 in prokaryotic or eukaryotic cells. For example, EMI1 can be
expressed in bacterial cells such as E. coli, insect cells (using baculovirus expression
vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press,
San Diego, CA (1990). Alternatively, the recombinant expression vector can be
transcribed and translated in vitro, for example using T7 promoter regulatory sequences
and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in E. coli with
vectors containing constitutive or inducible promoters directing the expression of either
fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a protein
encoded therein, usually to the amino terminus of the recombinant protein. Such fusion
vectors typically serve three purposes: 1) to increase expression of recombinant protein;
2) to increase the solubility of the recombinant protein; and 3) to aid in the purification
of the recombinant protein by acting as a ligand in affinity purification. Often, in fusion
expression vectors, a proteolytic cleavage site is introduced at the junction of the fusion
moiety and the recombinant protein to enable separation of the recombinant protein from
the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and
their cognate recognition sequences, include Factor Xa, thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith, D.B.
and Johnson, K.S. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, MA)
and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST),
maltose E binding protein, or protein A, respectively, to the target recombinant protein.
In one embodiment, the coding sequence of the EMI1 is cloned into a pGEX expression
vector to create a vector encoding a fusion protein comprising, from the N-terminus to
the C-terminus, GST-thrombin cleavage site-EMI1. The fusion protein can be purified
by affinity chromatography using glutathione-agarose resin. Recombinant EMI1
unfused to GST can be recovered by cleavage of the fusion protein with thrombin.

Examples of suitable inducible non-fusion E. coli expression vectors include
pTrc (Amann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene
Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
California (1990) 60-89). Target gene expression from the pTrc vector relies on host
RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene
expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion
promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral
polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident λ
prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5
promoter.

One strategy to maximize recombinant protein expression in E. coli is to express
the protein in a host bacteria with an impaired capacity to proteolytically cleave the
recombinant protein (Gottesman, S., Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, California (1990) 119-128). Another
strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an
expression vector so that the individual codons for each amino acid are those
preferentially utilized in E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
Such alteration of nucleic acid sequences of the invention can be carried out by standard
DNA synthesis techniques.
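
The codon replacement strategy described above can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the genetic code table is deliberately partial, and the `PREFERRED` choices are placeholders standing in for a measured codon usage table (such as the one compiled by Wada et al.), not values taken from it.

```python
# Partial genetic code, sufficient for the toy sequence below.
CODON_TO_AA = {
    "ATG": "M",
    "AAA": "K", "AAG": "K",
    "CTG": "L", "TTA": "L", "CTA": "L",
    "GAA": "E", "GAG": "E",
    "GGT": "G", "GGA": "G",
    "TAA": "*", "TGA": "*",
}

# Placeholder "preferred" codon per amino acid (an assumption for illustration).
PREFERRED = {"M": "ATG", "K": "AAA", "L": "CTG", "E": "GAA", "G": "GGT", "*": "TAA"}


def translate(coding_seq, codon_to_aa=CODON_TO_AA):
    """Translate a coding sequence whose length is a multiple of three."""
    if len(coding_seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(codon_to_aa[coding_seq[i : i + 3]]
                   for i in range(0, len(coding_seq), 3))


def recode(coding_seq, codon_to_aa=CODON_TO_AA, preferred=PREFERRED):
    """Replace each codon with the preferred synonymous codon."""
    return "".join(preferred[aa] for aa in translate(coding_seq, codon_to_aa))


if __name__ == "__main__":
    original = "ATGAAGTTAGAGGGATGA"  # toy sequence, not from SEQ ID NO:1
    recoded = recode(original)
    print(recoded, translate(original) == translate(recoded))
```

The check that both sequences translate identically reflects the point of the strategy: only the nucleotide sequence changes, while the encoded protein does not.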

In another embodiment, the EMI1 expression vector is a yeast expression vector.
Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari, et
al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943),
pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen
Corporation, San Diego, CA).

Alternatively, EMI1 can be expressed in insect cells using baculovirus
expression vectors. Baculovirus vectors available for expression of proteins in cultured
insect cells (e.g., Sf 9 cells) include the pAc series (Smith et al. (1983) Mol. Cell Biol.
3:2156-2165) and the pVL series (Lucklow and Summers (1989) Virology 170:31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC
(Kaufman et al. (1987) EMBO J. 6:187-195). When used in mammalian cells, the
expression vector's control functions are often provided by viral regulatory elements.
For example, commonly used promoters are derived from polyoma, Adenovirus 2,
cytomegalovirus and Simian Virus 40. For other suitable expression systems for both
prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, J., Fritsch, E. F.,
and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
1989.

In another embodiment, the recombinant mammalian expression vector is
capable of directing expression of the nucleic acid preferentially in a particular cell type
(e.g., tissue-specific regulatory elements are used to express the nucleic acid).
Tissue-specific regulatory elements are known in the art. Non-limiting examples of suitable
tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al.
(1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988)
Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and
Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell
33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters
(e.g., the neurofilament promoter; Byrne and Ruddle (1989) PNAS 86:5473-5477),
pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Patent No. 4,873,316 and
European Application Publication No. 264,166). Developmentally-regulated promoters
are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990)
Science 249:374-379) and the α-fetoprotein promoter (Campes and Tilghman (1989)
Genes Dev. 3:537-546).

The invention further provides a recombinant expression vector comprising a
DNA molecule of the invention cloned into the expression vector in an antisense
orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in
a manner which allows for expression (by transcription of the DNA molecule) of an
RNA molecule which is antisense to EMI1 mRNA. Regulatory sequences operatively
linked to a nucleic acid cloned in the antisense orientation can be chosen which direct
the continuous expression of the antisense RNA molecule in a variety of cell types, for
instance viral promoters and/or enhancers, or regulatory sequences can be chosen which
direct constitutive, tissue specific or cell type specific expression of antisense RNA. The
antisense expression vector can be in the form of a recombinant plasmid, phagemid or
attenuated virus in which antisense nucleic acids are produced under the control of a
high efficiency regulatory region, the activity of which can be determined by the cell
type into which the vector is introduced. For a discussion of the regulation of gene
expression using antisense genes see Weintraub, H. et al., Antisense RNA as a molecular
tool for genetic analysis, Reviews - Trends in Genetics, Vol. 1(1) 1986.

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such
terms refer not only to the particular subject cell but to the progeny or potential progeny
of such a cell. Because certain modifications may occur in succeeding generations due
to either mutation or environmental influences, such progeny may not, in fact, be
identical to the parent cell, but are still included within the scope of the term as used
herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, EMI1
protein can be expressed in bacterial cells such as E. coli, insect cells, yeast or
mammalian cells (such as Chinese hamster ovary cells (CHO) or COS cells). Other
suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of art-recognized
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, or electroporation. Suitable methods for transforming or
transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, 1989), and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may
integrate the foreign DNA into their genome. In order to identify and select these
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is
generally introduced into the host cells along with the gene of interest. Preferred
selectable markers include those which confer resistance to drugs, such as G418,
hygromycin and methotrexate. Nucleic acid encoding a selectable marker can be
introduced into a host cell on the same vector as that encoding EMI1 or can be
introduced on a separate vector. Cells stably transfected with the introduced nucleic
acid can be identified by drug selection (e.g., cells that have incorporated the selectable
marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in
culture, can be used to produce (i.e., express) EMI1 protein. Accordingly, the invention
further provides methods for producing EMI1 protein using the host cells of the
invention. In one embodiment, the method comprises culturing the host cell of the
invention (into which a recombinant expression vector encoding EMI1 has been
introduced) in a suitable medium until EMI1 is produced. In another embodiment, the
method further comprises isolating EMI1 from the medium or the host cell.

The host cells of the invention can also be used to produce nonhuman transgenic
animals. The nonhuman transgenic animals can be used in screening assays designed to
identify agents or compounds, e.g., drugs, pharmaceuticals, etc., which are capable of
ameliorating detrimental symptoms of selected disorders such as cardiovascular
disorders and proliferative disorders. For example, in one embodiment, a host cell of the
invention is a fertilized oocyte or an embryonic stem cell into which EMI1-coding
sequences have been introduced. Such host cells can then be used to create non-human
transgenic animals in which exogenous EMI1 sequences have been introduced into their
genome or homologous recombinant animals in which endogenous EMI1 sequences
have been altered. Such animals are useful for studying the function and/or activity of
EMI1 and for identifying and/or evaluating modulators of EMI1 activity. As used
herein, a "transgenic animal" is a nonhuman animal, preferably a mammal, more
preferably a rodent such as a rat or mouse, in which one or more of the cells of the
animal includes a transgene. Other examples of transgenic animals include nonhuman
primates, sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous
DNA which is integrated into the genome of a cell from which a transgenic animal
develops and which remains in the genome of the mature animal, thereby directing the
expression of an encoded gene product in one or more cell types or tissues of the
transgenic animal. As used herein, a "homologous recombinant animal" is a nonhuman
animal, preferably a mammal, more preferably a mouse, in which an endogenous EMI1
gene has been altered by homologous recombination between the endogenous gene and
an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic
cell of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing
EMI1-encoding nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by
microinjection or retroviral infection, and allowing the oocyte to develop in a
pseudopregnant female foster animal. The human EMI1 cDNA sequence of SEQ ID
NO:1 can be introduced as a transgene into the genome of a nonhuman animal.
Alternatively, a nonhuman homologue of the human EMI1 gene, such as a mouse EMI1
gene, can be isolated based on hybridization to the human EMI1 cDNA (described
further in subsection I above) and used as a transgene. Intronic sequences and
polyadenylation signals can also be included in the transgene to increase the efficiency
of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably
linked to the EMI1 transgene to direct expression of EMI1 protein to particular cells.
Methods for generating transgenic animals via embryo manipulation and microinjection,
particularly animals such as mice, have become conventional in the art and are
described, for example, in U.S. Patent Nos. 4,736,866 and 4,870,009, both by Leder et
al., U.S. Patent No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the
Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.,
1986). Similar methods are used for production of other transgenic animals. A
transgenic founder animal can be identified based upon the presence of the EMI1
transgene in its genome and/or expression of EMI1 mRNA in tissues or cells of the
animals. A transgenic founder animal can then be used to breed additional animals
carrying the transgene. Moreover, transgenic animals carrying a transgene encoding
EMI1 can further be bred to other transgenic animals carrying other transgenes.

To create a homologous recombinant animal, a vector is prepared which contains
at least a portion of an EMI1 gene into which a deletion, addition or substitution has
been introduced to thereby alter, e.g., functionally disrupt, the EMI1 gene. The EMI1
gene can be a human gene (e.g., from a human genomic clone isolated from a human
genomic library screened with the cDNA of SEQ ID NO:1), but more preferably, is a
nonhuman homologue of a human EMI1 gene. For example, a mouse EMI1 gene can be
isolated from a mouse genomic DNA library using the human EMI1 cDNA of SEQ ID
NO:1 as a probe. The mouse EMI1 gene then can be used to construct a homologous
recombination vector suitable for altering an endogenous EMI1 gene in the mouse
genome. In a preferred embodiment, the vector is designed such that, upon homologous
recombination, the endogenous EMI1 gene is functionally disrupted (i.e., no longer
encodes a functional protein; also referred to as a "knock out" vector). Alternatively, the
vector can be designed such that, upon homologous recombination, the endogenous
EMI1 gene is mutated or otherwise altered but still encodes functional protein (e.g., the
upstream regulatory region can be altered to thereby alter the expression of the
endogenous EMI1 protein). In the homologous recombination vector, the altered
portion of the EMI1 gene is flanked at its 5' and 3' ends by additional nucleic acid of the
EMI1 gene to allow for homologous recombination to occur between the exogenous
EMI1 gene carried by the vector and an endogenous EMI1 gene in an embryonic stem
cell. The additional flanking EMI1 nucleic acid is of sufficient length for successful
homologous recombination with the endogenous gene. Typically, several kilobases of
flanking DNA (both at the 5' and 3' ends) are included in the vector (see e.g., Thomas,
K.R. and Capecchi, M. R. (1987) Cell 51:503 for a description of homologous
recombination vectors). The vector is introduced into an embryonic stem cell line (e.g.,
by electroporation) and cells in which the introduced EMI1 gene has homologously
recombined with the endogenous EMI1 gene are selected (see e.g., Li, E. et al. (1992)
Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a
mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, E.J. Robertson, ed. (IRL, Oxford, 1987)
pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant
female foster animal and the embryo brought to term. Progeny harboring the
homologously recombined DNA in their germ cells can be used to breed animals in
which all cells of the animal contain the homologously recombined DNA by germline
transmission of the transgene. Methods for constructing homologous recombination
vectors and homologous recombinant animals are described further in Bradley, A.
(1991) Current Opinion in Biotechnology 2:823-829 and in PCT International
Publication Nos.: WO 90/11354 by Le Mouellec et al.; WO 91/01140 by Smithies et
al.; WO 92/0968 by Zijlstra et al.; and WO 93/04169 by Berns et al.

In another embodiment, transgenic nonhuman animals can be produced which
contain selected systems which allow for regulated expression of the transgene. One
example of such a system is the cre/loxP recombinase system of bacteriophage P1. For
a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS
89:6232-6236. Another example of a recombinase system is the FLP recombinase
system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If
a cre/loxP recombinase system is used to regulate expression of the transgene, animals
containing transgenes encoding both the Cre recombinase and a selected protein are
required. Such animals can be provided through the construction of "double" transgenic
animals, e.g., by mating two transgenic animals, one containing a transgene encoding a
selected protein and the other containing a transgene encoding a recombinase.

Clones of the nonhuman transgenic animals described herein can also be
produced according to the methods described in Wilmut, I. et al. (1997) Nature
385:810-813. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be
isolated and induced to exit the growth cycle and enter G0 phase. The quiescent cell can
then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an
animal of the same species from which the quiescent cell is isolated. The reconstructed
oocyte is then cultured such that it develops to morula or blastocyst and then transferred
to a pseudopregnant female foster animal. The offspring borne of this female foster animal
will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

III. Isolated EMI1 Proteins and Anti-EMI1 Antibodies

Another aspect of the invention pertains to isolated EMI1 proteins, and
biologically active portions thereof, as well as peptide fragments suitable for use as
immunogens to raise anti-EMI1 antibodies. An "isolated" or "purified" protein or
biologically active portion thereof is substantially free of cellular material when
produced by recombinant DNA techniques, or chemical precursors or other chemicals
when chemically synthesized. The language "substantially free of cellular material"
includes preparations of EMI1 protein in which the protein is separated from cellular
components of the cells in which it is naturally or recombinantly produced. In one
embodiment, the language "substantially free of cellular material" includes preparations
of EMI1 protein having less than about 30% (by dry weight) of non-EMI1 protein (also
referred to herein as a "contaminating protein"), more preferably less than about 20% of
non-EMI1 protein, still more preferably less than about 10% of non-EMI1 protein, and
most preferably less than about 5% non-EMI1 protein. When the EMI1 protein or
biologically active portion thereof is recombinantly produced, it is also preferably
substantially free of culture medium, i.e., culture medium represents less than about
20%, more preferably less than about 10%, and most preferably less than about 5% of
the volume of the protein preparation. The language "substantially free of chemical
precursors or other chemicals" includes preparations of EMI1 protein in which the
protein is separated from chemical precursors or other chemicals which are involved in
the synthesis of the protein. In one embodiment, the language "substantially free of
chemical precursors or other chemicals" includes preparations of EMI1 protein having
less than about 30% (by dry weight) of chemical precursors or non-EMI1 chemicals,
more preferably less than about 20% chemical precursors or non-EMI1 chemicals, still
more preferably less than about 10% chemical precursors or non-EMI1 chemicals, and
most preferably less than about 5% chemical precursors or non-EMI1 chemicals. In
preferred embodiments, isolated proteins or biologically active portions thereof lack
contaminating proteins from the same animal from which the EMI1 protein is derived.
Typically, such proteins are produced by recombinant expression of, for example, a
human EMI1 protein in a nonhuman cell.
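
The dry-weight thresholds recited above can be illustrated with a short calculation. The following Python sketch is illustrative only: the function name, the tier labels, and the treatment of "about" as an exact cutoff are assumptions made for this example and are not part of the specification.

```python
def purity_tier(target_mg: float, contaminant_mg: float) -> str:
    """Classify a preparation by its dry-weight fraction of contaminating protein.

    The cutoffs (30%, 20%, 10%, 5%) follow the text; treating "about" as an
    exact boundary is a simplifying assumption of this sketch.
    """
    total = target_mg + contaminant_mg
    if total <= 0:
        raise ValueError("total dry weight must be positive")
    fraction = contaminant_mg / total
    # Check the tightest threshold first so the best matching tier is returned.
    tiers = [
        (0.05, "most preferred (<5%)"),
        (0.10, "still more preferred (<10%)"),
        (0.20, "more preferred (<20%)"),
        (0.30, "substantially free (<30%)"),
    ]
    for limit, label in tiers:
        if fraction < limit:
            return label
    return "not substantially free"


if __name__ == "__main__":
    # Hypothetical preparation: 97 mg target protein, 3 mg contaminating protein.
    print(purity_tier(97.0, 3.0))
```

The same arithmetic applies to the chemical precursor thresholds, with the contaminant mass taken as the dry weight of chemical precursors or other chemicals.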

An isolated EMI1 protein or a portion thereof of the invention can modulate a
TGF-β response in a TGF-β responsive cell. In preferred embodiments, the protein or
portion thereof comprises an amino acid sequence which is sufficiently homologous to
an amino acid sequence of SEQ ID NO:2 such that the protein or portion thereof
maintains the ability to modulate a TGF-β response in a TGF-β responsive cell. The
portion of the protein is preferably a biologically active portion as described herein. In
another preferred embodiment, the EMI1 protein (i.e., amino acid residues 1-335) has an
amino acid sequence shown in SEQ ID NO:2 or an amino acid sequence which is
encoded by the nucleotide sequence of the DNA insert of the plasmid deposited with
ATCC as Accession Number 98375. In yet another preferred embodiment, the EMI1
protein has an amino acid sequence which is encoded by a nucleotide sequence which
hybridizes, e.g., hybridizes under stringent conditions, to the nucleotide sequence of the
DNA insert of the plasmid deposited with ATCC as Accession Number 98375. In still
another preferred embodiment, the EMI1 protein has an amino acid sequence which is
encoded by a nucleotide sequence that is at least about 60-65%, preferably at least about
70-75%, more preferably at least about 80-85%, and even more preferably at least about
90-95% or more homologous to the nucleotide sequence of the DNA insert of the
plasmid deposited with ATCC as Accession Number 98375. The preferred EMI1
proteins of the present invention also preferably possess at least one of the EMI1
activities described herein. For example, a preferred EMI1 protein of the present
invention includes an amino acid sequence encoded by a nucleotide sequence which
hybridizes, e.g., hybridizes under stringent conditions, to the nucleotide sequence of the
DNA insert of the plasmid deposited with ATCC as Accession Number 98375 and
which can modulate a TGF-β response in a TGF-β responsive cell.

In other embodiments, the EMI1 protein is substantially homologous to the
amino acid sequence of SEQ ID NO:2 and retains the functional activity of the protein
of SEQ ID NO:2 yet differs in amino acid sequence due to natural allelic variation or
mutagenesis, as described in detail in subsection I above. Accordingly, in another
embodiment, the EMI1 protein is a protein which comprises an amino acid sequence
which is at least about 60-70%, preferably at least about 80-85%, more preferably at
least about 86%, 88%, or 90%, and most preferably at least about 90-95% or more
homologous to the entire amino acid sequence of SEQ ID NO:2 and which has at least
one of the EMI1 activities described herein. In another embodiment, the invention
pertains to a full length human protein which is substantially homologous to the entire
amino acid sequence of SEQ ID NO:2.
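
By way of illustration only, a percentage of this kind can be computed from two sequences that have already been aligned. The Python sketch below counts identical positions over aligned columns; it assumes that percent identity over a pre-computed alignment is an acceptable stand-in for "percent homologous", whereas the governing definition is the one given in subsection I. The function name and the example sequences are hypothetical and are not taken from SEQ ID NO:2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must be the rows of one pairwise alignment (equal length,
    '-' for gaps). Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # empty column, carries no information
        compared += 1
        if a == b:
            matches += 1  # a gap never equals a residue, so gaps count as mismatches
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical six-column alignment with one gap.
    print(round(percent_identity("MKT-LV", "MKTALV"), 1))
```

A sequence pair would then satisfy, for example, the "at least about 80-85%" embodiment when the returned value falls at or above that range.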

Biologically active portions of the EMI1 protein include peptides comprising
amino acid sequences derived from the amino acid sequence of the EMI1 protein, e.g.,
the amino acid sequence shown in SEQ ID NO:2 or the amino acid sequence of a protein
homologous to the EMI1 protein, which include fewer amino acids than the full length
EMI1 protein or the full length protein which is homologous to the EMI1 protein, and
exhibit at least one activity of the EMI1 protein. Typically, biologically active portions
(peptides, e.g., peptides which are, for example, 5, 10, 15, 20, 30, 35, 36, 37, 38, 39, 40,
50, 100 or more amino acids in length) comprise a domain or motif, e.g., a WW domain,
with at least one activity of the EMI1 protein. Preferably, the domain is a WW domain
derived from a human and is at least about 55%, preferably at least about 60-65%, even
more preferably at least about 70-75%, and most preferably at least about 80-90% or
more homologous to SEQ ID NO:4. If the WW domain is derived from a nonmammal,
it is preferably at least about 75%, preferably at least about 80-85%, and most preferably
at least about 90-95% or more homologous to SEQ ID NO:4. In a preferred
embodiment, the biologically active portion of the protein which includes the WW
domain can modulate the activity of a protein having a PY motif and/or modulate a
TGF-β response in a TGF-β responsive cell, e.g., an endothelial cell, to thereby
beneficially affect the TGF-β responsive cell. In a preferred embodiment, the
biologically active portion comprises the WW domain of EMI1 as represented by amino
acid residues 300 to 335 of SEQ ID NO:2 and SEQ ID NO:4. Moreover, other
biologically active portions, in which other regions of the protein are deleted, can be
prepared by recombinant techniques and evaluated for one or more of the activities
described herein. Preferably, the biologically active portions of the EMI1 protein
include one or more selected domains/motifs or portions thereof having biological
activity.

EMI1 proteins are preferably produced by recombinant DNA techniques. For
example, a nucleic acid molecule encoding the protein is cloned into an expression
vector (as described above), the expression vector is introduced into a host cell (as
described above) and the EMI1 protein is expressed in the host cell. The EMI1 protein
can then be isolated from the cells by an appropriate purification scheme using standard
protein purification techniques. As an alternative to recombinant expression, an EMI1
protein, polypeptide, or peptide can be synthesized chemically using standard peptide
synthesis techniques. Moreover, native EMI1 protein can be isolated from cells (e.g.,
endothelial cells), for example using an anti-EMI1 antibody (described further below).

The invention also provides EMI1 chimeric or fusion proteins. As used herein,
an EMI1 "chimeric protein" or "fusion protein" comprises an EMI1 polypeptide
operatively linked to a non-EMI1 polypeptide. An "EMI1 polypeptide" refers to a
polypeptide having an amino acid sequence corresponding to EMI1, whereas a
"non-EMI1 polypeptide" refers to a polypeptide having an amino acid sequence
corresponding to a protein which is not substantially homologous to the EMI1 protein,
e.g., a protein which is different from the EMI1 protein and which is derived from the
same or a different organism. Within the fusion protein, the term "operatively linked" is
intended to indicate that the EMI1 polypeptide and the non-EMI1 polypeptide are fused
in-frame to each other. The non-EMI1 polypeptide can be fused to the N-terminus or
C-terminus of the EMI1 polypeptide. For example, in one embodiment the fusion protein
is a GST-EMI1 fusion protein in which the EMI1 sequences are fused to the C-terminus
of the GST sequences. Such fusion proteins can facilitate the purification of
recombinant EMI1. In another embodiment, the fusion protein is an EMI1 protein
containing a heterologous signal sequence at its N-terminus. In certain host cells (e.g.,
mammalian host cells), expression and/or secretion of EMI1 can be increased through
use of a heterologous signal sequence.

Preferably, an EMI1 chimeric or fusion protein of the invention is produced by
standard recombinant DNA techniques. For example, DNA fragments coding for the
different polypeptide sequences are ligated together in-frame in accordance with
conventional techniques, for example by employing blunt-ended or stagger-ended
termini for ligation, restriction enzyme digestion to provide for appropriate termini,
filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid
undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene
can be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor
primers which give rise to complementary overhangs between two consecutive gene
fragments which can subsequently be annealed and reamplified to generate a chimeric
gene sequence (see, for example, Current Protocols in Molecular Biology, eds. Ausubel
et al. John Wiley & Sons: 1992). Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a GST polypeptide). An
EMI1-encoding nucleic acid can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the EMI1 protein.
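
The in-frame requirement described above can be checked computationally before a construct is assembled. The following Python sketch is offered as an illustration under stated assumptions: both coding fragments are supplied 5' to 3' starting at their first codon, the function name and the optional linker parameter are hypothetical, and the short sequences in the usage example are invented and do not represent GST or EMI1.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_in_frame_fusion(upstream_cds: str, downstream_cds: str, linker: str = "") -> bool:
    """Return True if upstream + linker + downstream reads through in one frame.

    Assumes each coding fragment begins at the first base of its first codon.
    A stop codon is tolerated only as the final codon of the joined sequence.
    """
    upstream_cds = upstream_cds.upper()
    downstream_cds = downstream_cds.upper()
    linker = linker.upper()
    # The downstream fragment keeps its frame only if everything before it
    # is a whole number of codons.
    if (len(upstream_cds) + len(linker)) % 3 != 0:
        return False
    joined = upstream_cds + linker + downstream_cds
    if len(joined) % 3 != 0:
        return False
    codons = [joined[i:i + 3] for i in range(0, len(joined), 3)]
    # Any stop before the last codon would truncate the fusion protein.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    # Hypothetical fragments: a two-codon upstream moiety and a short downstream ORF.
    print(is_in_frame_fusion("ATGGCC", "ATGAAATAA"))
    print(is_in_frame_fusion("ATGGCC", "ATGAAATAA", linker="G"))
```

A single extra base in the linker, as in the second call, shifts the downstream reading frame, which is why filling-in of cohesive ends and the choice of restriction sites are selected so that the junction preserves a whole number of codons.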

The present invention also pertains to homologues of the EMI1 proteins which
function as either an EMI1 agonist (mimetic) or an EMI1 antagonist. In a preferred
embodiment, the EMI1 agonists and antagonists stimulate or inhibit, respectively, a
subset of the biological activities of the naturally occurring form of the EMI1 protein.
Thus, specific biological effects can be elicited by treatment with a homologue of
limited function. In one embodiment, treatment of a subject with a homologue having a
subset of the biological activities of the naturally occurring form of the protein has fewer
side effects in a subject relative to treatment with the naturally occurring form of the
EMI1 protein.

Homologues of the EMI1 protein can be generated by mutagenesis, e.g., discrete
point mutation or truncation of the EMI1 protein. As used herein, the term "homologue"
refers to a variant form of the EMI1 protein which acts as an agonist or antagonist of the
activity of the EMI1 protein. An agonist of the EMI1 protein can retain substantially the
same, or a subset, of the biological activities of the EMI1 protein. An antagonist of the
EMI1 protein can inhibit one or more of the activities of the naturally occurring form of
the EMI1 protein, by, for example, competitively binding to a downstream or upstream
member of the EMI1 cascade which includes the EMI1 protein. Thus, the mammalian
EMI1 protein and homologues thereof of the present invention can be either positive or
negative regulators of TGF-β responses in TGF-β responsive cells.

In an alternative embodiment, homologues of the EMI1 protein can be identified
by screening combinatorial libraries of mutants, e.g., truncation mutants, of the EMI1
protein for EMI1 protein agonist or antagonist activity. In one embodiment, a
variegated library of EMI1 variants is generated by combinatorial mutagenesis at the
nucleic acid level and is encoded by a variegated gene library. A variegated library of
EMI1 variants can be produced by, for example, enzymatically ligating a mixture of
synthetic oligonucleotides into gene sequences such that a degenerate set of potential
EMI1 sequences is expressible as individual polypeptides, or alternatively, as a set of
larger fusion proteins (e.g., for phage display) containing the set of EMI1 sequences
therein. There are a variety of methods which can be used to produce libraries of
potential EMI1 homologues from a degenerate oligonucleotide sequence. Chemical
synthesis of a degenerate gene sequence can be performed in an automatic DNA
synthesizer, and the synthetic gene then ligated into an appropriate expression vector.
Use of a degenerate set of genes allows for the provision, in one mixture, of all of the
sequences encoding the desired set of potential EMI1 sequences. Methods for
synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang, S.A.
(1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al.
(1984) Science 198:1056; Ike et al. (1983) Nucleic Acid Res. 11:477).
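
To illustrate how a single degenerate oligonucleotide stands for a mixture of sequences, the following Python sketch expands a degenerate sequence into every concrete sequence it encodes. The mapping uses the standard IUPAC nucleotide ambiguity codes; the function name and the example oligonucleotides are hypothetical and are not drawn from the EMI1 coding sequence.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """List every concrete DNA sequence encoded by a degenerate oligonucleotide."""
    try:
        pools = [IUPAC[base] for base in oligo.upper()]
    except KeyError as err:
        raise ValueError(f"not an IUPAC nucleotide code: {err.args[0]}") from None
    # The Cartesian product over per-position pools enumerates the mixture.
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    # "GCN" covers the four alanine codons.
    print(expand_degenerate("GCN"))
```

The size of the mixture is the product of the pool sizes at each position, so the number of library members grows multiplicatively with every additional degenerate position.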

In addition, libraries of fragments of the EMI1 protein coding sequence can be used to
generate a variegated population of EMI1 fragments for screening and subsequent
selection of homologues of an EMI1 protein. In one embodiment, a library of coding
sequence fragments can be generated by treating a double stranded PCR fragment of an
EMI1 coding sequence with a nuclease under conditions wherein nicking occurs only
about once per molecule, denaturing the double stranded DNA, renaturing the DNA to
form double stranded DNA which can include sense/antisense pairs from different
nicked products, removing single stranded portions from reformed duplexes by
treatment with S1 nuclease, and ligating the resulting fragment library into an expression
vector. By this method, an expression library can be derived which encodes N-terminal,
C-terminal and internal fragments of various sizes of the EMI1 protein.

Several techniques are known in the art for screening gene products of
combinatorial libraries made by point mutations or truncation, and for screening cDNA
libraries for gene products having a selected property. Such techniques are adaptable for
rapid screening of the gene libraries generated by the combinatorial mutagenesis of
EMI1 homologues. The most widely used techniques, which are amenable to high
through-put analysis, for screening large gene libraries typically include cloning the
gene library into replicable expression vectors, transforming appropriate cells with the
resulting library of vectors, and expressing the combinatorial genes under conditions in
which detection of a desired activity facilitates isolation of the vector encoding the gene
whose product was detected. Recursive ensemble mutagenesis (REM), a new technique
which enhances the frequency of functional mutants in the libraries, can be used in
combination with the screening assays to identify EMI1 homologues (Arkin and
Yourvan (1992) PNAS 89:7811-7815; Delgrave et al. (1993) Protein Engineering
6(3):327-331).

In one embodiment, cell based assays can be exploited to analyze a variegated
EMI1 library. For example, a library of expression vectors can be transfected into a cell
line ordinarily responsive to a particular TGF-β. The transfected cells are then contacted
with the TGF-β and the effect of the EMI1 mutant on signaling by TGF-β can be
detected, e.g., by measuring [3H]thymidine incorporation. Plasmid DNA can then be
recovered from the cells which score for inhibition, or alternatively, potentiation of
TGF-β induction, and the individual clones further characterized.

An isolated EMI1 protein, or a portion or fragment thereof, can be used as an
immunogen to generate antibodies that bind EMI1 using standard techniques for
polyclonal and monoclonal antibody preparation. The full-length EMI1 protein can be
used or, alternatively, the invention provides antigenic peptide fragments of EMI1 for
use as immunogens. The antigenic peptide of EMI1 comprises at least 8 amino acid
residues of the amino acid sequence shown in SEQ ID NO:2 and encompasses an
epitope of EMI1 such that an antibody raised against the peptide forms a specific
immune complex with EMI1. Preferably, the antigenic peptide comprises at least 10
amino acid residues, more preferably at least 15 amino acid residues, even more
preferably at least 20 amino acid residues, and most preferably at least 30 amino acid
residues. Preferred epitopes encompassed by the antigenic peptide are regions of EMI1
that are located on the surface of the protein, e.g., hydrophilic regions.
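
One way a candidate hydrophilic region might be located is with a sliding-window hydropathy scan. The Python sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption of this example and is not named in the specification; hydrophilicity is only a rough proxy for surface exposure, and the function name and test sequence are hypothetical. The default window of 8 residues matches the minimum antigenic peptide length stated above.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophilic_window(sequence: str, window: int = 8) -> tuple[int, str, float]:
    """Return (0-based start, peptide, mean hydropathy) of the most hydrophilic window.

    Ties are resolved in favor of the window closest to the N-terminus.
    """
    sequence = sequence.upper()
    if window < 1 or len(sequence) < window:
        raise ValueError("window must be between 1 and the sequence length")
    best_start = 0
    best_score = None
    for start in range(len(sequence) - window + 1):
        peptide = sequence[start:start + window]
        score = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / window
        if best_score is None or score < best_score:
            best_start, best_score = start, score
    return best_start, sequence[best_start:best_start + window], best_score


if __name__ == "__main__":
    # Hypothetical sequence: a lysine-rich stretch flanked by leucine runs.
    print(most_hydrophilic_window("LLLLLLLLKKKKKKKKLLLLLLLL"))
```

A longer window (for example 10, 15, 20 or 30 residues) can be passed to match the preferred peptide lengths, and any region selected this way would still need to be confirmed experimentally as an epitope.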

An EMI1 immunogen typically is used to prepare antibodies by immunizing a
suitable subject (e.g., rabbit, goat, mouse or other mammal) with the immunogen. An
appropriate immunogenic preparation can contain, for example, recombinantly
expressed EMI1 protein or a chemically synthesized EMI1 peptide. The preparation can
further include an adjuvant, such as Freund's complete or incomplete adjuvant, or similar
immunostimulatory agent. Immunization of a suitable subject with an immunogenic
EMI1 preparation induces a polyclonal anti-EMI1 antibody response.

Accordingly, another aspect of the invention pertains to anti-EMI1 antibodies.
The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin molecules, i.e., molecules that
contain an antigen binding site which specifically binds (immunoreacts with) an antigen,
such as EMI1. Examples of immunologically active portions of immunoglobulin
molecules include F(ab) and F(ab')2 fragments which can be generated by treating the
antibody with an enzyme such as pepsin. The invention provides polyclonal and
monoclonal antibodies that bind EMI1. The term "monoclonal antibody" or
"monoclonal antibody composition", as used herein, refers to a population of antibody
molecules that contain only one species of an antigen binding site capable of
immunoreacting with a particular epitope of EMI1. A monoclonal antibody composition
thus typically displays a single binding affinity for a particular EMI1 protein with which
it immunoreacts.

Polyclonal anti-EMI1 antibodies can be prepared as described above by
immunizing a suitable subject with an EMI1 immunogen. The anti-EMI1 antibody titer
in the immunized subject can be monitored over time by standard techniques, such as
with an enzyme linked immunosorbent assay (ELISA) using immobilized EMI1. If
desired, the antibody molecules directed against EMI1 can be isolated from the mammal
(e.g., from the blood) and further purified by well known techniques, such as protein A
chromatography to obtain the IgG fraction. At an appropriate time after immunization,
e.g., when the anti-EMI1 antibody titers are highest, antibody-producing cells can be
obtained from the subject and used to prepare monoclonal antibodies by standard
techniques, such as the hybridoma technique originally described by Kohler and
Milstein (1975) Nature 256:495-497 (see also, Brown et al. (1981) J. Immunol.
127:539-46; Brown et al. (1980) J. Biol. Chem. 255:4980-83; Yeh et al. (1976) PNAS
76:2927-31; and Yeh et al. (1982) Int. J. Cancer 29:269-75), the more recent human B
cell hybridoma technique (Kozbor et al. (1983) Immunol Today 4:72), the
EBV-hybridoma technique (Cole et al. (1985), Monoclonal Antibodies and Cancer Therapy,
Alan R. Liss, Inc., pp. 77-96) or trioma techniques. The technology for producing
monoclonal antibody hybridomas is well known (see generally R. H. Kenneth, in
Monoclonal Antibodies: A New Dimension In Biological Analyses, Plenum Publishing
Corp., New York, New York (1980); E. A. Lerner (1981) Yale J. Biol. Med.,
54:387-402; M. L. Gefter et al. (1977) Somatic Cell Genet. 3:231-36). Briefly, an
immortal cell line (typically a myeloma) is fused to lymphocytes (typically splenocytes)
from a mammal immunized with an EMI1 immunogen as described above, and the
culture supernatants of the resulting hybridoma cells are screened to identify a
hybridoma producing a monoclonal antibody that binds EMI1.

Any of the many well known protocols used for fusing lymphocytes and
immortalized cell lines can be applied for the purpose of generating an anti-EMI1
monoclonal antibody (see, e.g., G. Galfre et al. (1977) Nature 266:550-52; Gefter et al.
Somatic Cell Genet., cited supra; Lerner, Yale J. Biol. Med., cited supra; Kenneth,
Monoclonal Antibodies, cited supra). Moreover, the ordinarily skilled worker will
appreciate that there are many variations of such methods which also would be useful.
Typically, the immortal cell line (e.g., a myeloma cell line) is derived from the same
mammalian species as the lymphocytes. For example, murine hybridomas can be made
by fusing lymphocytes from a mouse immunized with an immunogenic preparation of
the present invention with an immortalized mouse cell line. Preferred immortal cell
lines are mouse myeloma cell lines that are sensitive to culture medium containing
hypoxanthine, aminopterin and thymidine ("HAT medium"). Any of a number of
myeloma cell lines can be used as a fusion partner according to standard techniques,
e.g., the P3-NS1/1-Ag4-1, P3-x63-Ag8.653 or Sp2/O-Ag14 myeloma lines. These
myeloma lines are available from ATCC. Typically, HAT-sensitive mouse myeloma
cells are fused to mouse splenocytes using polyethylene glycol ("PEG"). Hybridoma
cells resulting from the fusion are then selected using HAT medium, which kills unfused
and unproductively fused myeloma cells (unfused splenocytes die after several days
because they are not transformed). Hybridoma cells producing a monoclonal antibody
of the invention are detected by screening the hybridoma culture supernatants for
antibodies that bind EMI1, e.g., using a standard ELISA assay.

As an alternative to preparing monoclonal antibody-secreting hybridomas, a
monoclonal anti-EMI1 antibody can be identified and isolated by screening a
recombinant combinatorial immunoglobulin library (e.g., an antibody phage display
library) with EMI1 to thereby isolate immunoglobulin library members that bind EMI1.
Kits for generating and screening phage display libraries are commercially available
(e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; and
the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally,
examples of methods and reagents particularly amenable for use in generating and
screening an antibody display library can be found in, for example, Ladner et al. U.S.
Patent No. 5,223,409; Kang et al. PCT International Publication No. WO 92/18619;
Dower et al. PCT International Publication No. WO 91/17271; Winter et al. PCT
International Publication WO 92/20791; Markland et al. PCT International Publication
No. WO 92/15679; Breitling et al. PCT International Publication WO 93/01288;
McCafferty et al. PCT International Publication No. WO 92/01047; Garrard et al. PCT
International Publication No. WO 92/09690; Ladner et al. PCT International Publication
No. WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992)
Hum. Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281;
Griffiths et al. (1993) EMBO J 12:725-734; Hawkins et al. (1992) J. Mol. Biol.
226:889-896; Clarkson et al. (1991) Nature 352:624-628; Gram et al. (1992) PNAS
89:3576-3580; Garrad et al. (1991) Bio/Technology 9:1373-1377; Hoogenboom et al.
(1991) Nuc. Acid Res. 19:4133-4137; Barbas et al. (1991) PNAS 88:7978-7982; and
McCafferty et al. Nature (1990) 348:552-554.

Additionally, recombinant anti-EMI1 antibodies, such as chimeric and
humanized monoclonal antibodies, comprising both human and non-human portions,
which can be made using standard recombinant DNA techniques, are within the scope of
the invention. Such chimeric and humanized monoclonal antibodies can be produced by
recombinant DNA techniques known in the art, for example using methods described in
Robinson et al. International Application No. PCT/US86/02269; Akira, et al. European
Patent Application 184,187; Taniguchi, M., European Patent Application 171,496;
Morrison et al. European Patent Application 173,494; Neuberger et al. PCT International
Publication No. WO 86/01533; Cabilly et al. U.S. Patent No. 4,816,567; Cabilly et al.
European Patent Application 125,023; Better et al. (1988) Science 240:1041-1043; Liu
et al. (1987) PNAS 84:3439-3443; Liu et al. (1987) J. Immunol. 139:3521-3526; Sun et
al. (1987) PNAS 84:214-218; Nishimura et al. (1987) Canc. Res. 47:999-1005; Wood et
al. (1985) Nature 314:446-449; Shaw et al. (1988) J. Natl. Cancer Inst. 80:1553-1559;
Morrison, S. L. (1985) Science 229:1202-1207; Oi et al. (1986) BioTechniques
4:214; Winter U.S. Patent 5,225,539; Jones et al. (1986) Nature 321:522-525;
Verhoeyan et al. (1988) Science 239:1534; and Beidler et al. (1988) J. Immunol.
141:4053-4060.

An anti-EMI1 antibody (e.g., monoclonal antibody) can be used to isolate EMI1
by standard techniques, such as affinity chromatography or immunoprecipitation. An
anti-EMI1 antibody can facilitate the purification of natural EMI1 from cells and of
recombinantly produced EMI1 expressed in host cells. Moreover, an anti-EMI1
antibody can be used to detect EMI1 protein (e.g., in a cellular lysate or cell supernatant)
in order to evaluate the abundance and pattern of expression of the EMI1 protein.
Anti-EMI1 antibodies can be used diagnostically to monitor protein levels in tissue as part
of a clinical testing procedure, e.g., to determine the efficacy of a given
treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the
antibody to a detectable substance. Examples of detectable substances include various
enzymes, prosthetic groups, fluorescent materials, luminescent materials,
bioluminescent materials, and radioactive materials. Examples of suitable enzymes
include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or
acetylcholinesterase; examples of suitable prosthetic group complexes include
streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include
umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine,
dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a
luminescent material includes luminol; examples of bioluminescent materials include
luciferase, luciferin, and aequorin; and examples of suitable radioactive material include
35S or 3H.

IV. Pharmaceutical Compositions 

The EMI1 nucleic acid molecules, EMI1 proteins, EMI1 modulators, and
anti-EMI1 antibodies (also referred to herein as "active compounds") of the invention can
be incorporated into pharmaceutical compositions suitable for administration to a subject,
e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein,
modulator, or antibody and a pharmaceutically acceptable carrier. As used herein the
language "pharmaceutically acceptable carrier" is intended to include any and all
solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and
absorption delaying agents, and the like, compatible with pharmaceutical administration.
The use of such media and agents for pharmaceutically active substances is well known
in the art. Except insofar as any conventional media or agent is incompatible with the
active compound, such media can be used in the compositions of the invention.
Supplementary active compounds can also be incorporated into the compositions.

A pharmaceutical composition of the invention is formulated to be compatible
with its intended route of administration. Examples of routes of administration include
parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation),
transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions
used for parenteral, intradermal, or subcutaneous application can include the following
components: a sterile diluent such as water for injection, saline solution, fixed oils,
polyethylene glycols, glycerine, propylene glycol or other synthetic solvents;
antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as
ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic
acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of
tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases,
such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be
enclosed in ampoules, disposable syringes or multiple dose vials made of glass or
plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the
extemporaneous preparation of sterile injectable solutions or dispersion. For
intravenous administration, suitable carriers include physiological saline, bacteriostatic
water, Cremophor EL™ (BASF, Parsippany, NJ) or phosphate buffered saline (PBS). In
all cases, the composition must be sterile and should be fluid to the extent that easy
syringability exists. It must be stable under the conditions of manufacture and storage
and must be preserved against the contaminating action of microorganisms such as
bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for
example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid
polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity
can be maintained, for example, by the use of a coating such as lecithin, by the
maintenance of the required particle size in the case of dispersion and by the use of
surfactants. Prevention of the action of microorganisms can be achieved by various
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol,
ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include
isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium
chloride in the composition. Prolonged absorption of the injectable compositions can be
brought about by including in the composition an agent which delays absorption, for
example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active
compound (e.g., an EMI1 protein or anti-EMI1 antibody) in the required amount in an
appropriate solvent with one or a combination of ingredients enumerated above, as
required, followed by filtered sterilization. Generally, dispersions are prepared by
incorporating the active compound into a sterile vehicle which contains a basic
dispersion medium and the required other ingredients from those enumerated above. In
the case of sterile powders for the preparation of sterile injectable solutions, the
preferred methods of preparation are vacuum drying and freeze-drying, which yield a
powder of the active ingredient plus any additional desired ingredient from a previously
sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They 
can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral 
therapeutic administration, the active compound can be incorporated with excipients and 
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared 
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is 
applied orally and swished and expectorated or swallowed. Pharmaceutically 
compatible binding agents, and/or adjuvant materials can be included as part of the 
composition. The tablets, pills, capsules, troches and the like can contain any of the 
following ingredients, or compounds of a similar nature: a binder such as 
microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or 
lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant 
such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a 
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, 
methyl salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, 
e.g., a gas such as carbon dioxide, or a nebulizer. 

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art, 
and include, for example, for transmucosal administration, detergents, bile salts, and 
fusidic acid derivatives. Transmucosal administration can be accomplished through the 
use of nasal sprays or suppositories. For transdermal administration, the active 
compounds are formulated into ointments, salves, gels, or creams as generally known in 
the art. 

The compounds can also be prepared in the form of suppositories (e.g., with 
conventional suppository bases such as cocoa butter and other glycerides) or retention 
enemas for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will 
protect the compound against rapid elimination from the body, such as a controlled 
release formulation, including implants and microencapsulated delivery systems. 
Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, 
polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. 
Methods for preparation of such formulations will be apparent to those skilled in the art. 
The materials can also be obtained commercially from Alza Corporation and Nova 
Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected 
cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically 
acceptable carriers. These can be prepared according to methods known to those skilled 
in the art, for example, as described in U.S. Patent No. 4,522,811. 

It is especially advantageous to formulate oral or parenteral compositions in 
dosage unit form for ease of administration and uniformity of dosage. Dosage unit form 
as used herein refers to physically discrete units suited as unitary dosages for the subject 
to be treated; each unit containing a predetermined quantity of active compound 
calculated to produce the desired therapeutic effect in association with the required 
pharmaceutical carrier. The specification for the dosage unit forms of the invention is 
dictated by and directly dependent on the unique characteristics of the active compound 
and the particular therapeutic effect to be achieved, and the limitations inherent in the art 
of compounding such an active compound for the treatment of individuals. 

The nucleic acid molecules of the invention can be inserted into vectors and used 
as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for 
example, intravenous injection, local administration (see U.S. Patent 5,328,470) or by 
stereotactic injection (see e.g., Chen et al. (1994) PNAS 91:3054-3057). The 
pharmaceutical preparation of the gene therapy vector can include the gene therapy 
vector in an acceptable diluent, or can comprise a slow release matrix in which the gene 
delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector 
can be produced intact from recombinant cells, e.g. retroviral vectors, the 
pharmaceutical preparation can include one or more cells which produce the gene 
delivery system. 

The pharmaceutical compositions can be included in a container, pack, or 
dispenser together with instructions for administration. 



V. Uses and Methods of the Invention 

The nucleic acid molecules, proteins, protein homologues, modulators, and 
antibodies described herein can be used in one or more of the following methods: 1) 
drug screening assays; 2) diagnostic assays; and 3) methods of treatment. An EMI1 
protein of the invention has one or more of the activities described herein and can thus 
be used to, for example, modulate a TGF-β response in a TGF-β responsive cell. The 
isolated nucleic acid molecules of the invention can be used to express EMI1 protein 
(e.g., via a recombinant expression vector in a host cell in gene therapy applications), to 
detect EMI1 mRNA (e.g., in a biological sample) or a genetic lesion in an EMI1 gene, 
and to modulate EMI1 activity, as described further below. In addition, the EMI1 
proteins can be used to screen drugs or compounds which modulate EMI1 protein 
activity as well as to treat disorders characterized by insufficient production of EMI1 
protein or production of EMI1 protein forms which have decreased activity compared to 
wild type EMI1. Moreover, the anti-EMI1 antibodies of the invention can be used to 
detect and isolate EMI1 protein and modulate EMI1 protein activity. 

a. Drug Screening Assays: 

The invention provides methods for identifying compounds or agents which can 
be used to treat disorders characterized by (or associated with) aberrant or abnormal 
EMI1 nucleic acid expression and/or EMI1 protein activity. These methods are also 
referred to herein as drug screening assays and typically include the step of screening a 
candidate/test compound or agent for the ability to interact with (e.g., bind to) an EMI1 
protein, to modulate the interaction of an EMI1 protein and a target molecule, and/or to 
modulate EMI1 nucleic acid expression and/or EMI1 protein activity. Candidate/test 
compounds or agents which have one or more of these abilities can be used as drugs to 
treat disorders characterized by aberrant or abnormal EMI1 nucleic acid expression and/or 
EMI1 protein activity. Candidate/test compounds such as small molecules, e.g., small 
organic molecules, and other drug candidates can be obtained, for example, from 
combinatorial and natural product libraries. 

In one embodiment, the invention provides assays for screening candidate/test 
compounds which interact with (e.g., bind to) EMI1 protein. Typically, the assays are 
cell-free assays which include the steps of combining an EMI1 protein or a biologically 
active portion thereof, and a candidate/test compound, e.g., under conditions which 
allow for interaction of (e.g., binding of) the candidate/test compound to the EMI1 
protein or portion thereof to form a complex, and detecting the formation of a complex, 
in which the ability of the candidate compound to interact with (e.g., bind to) the EMI1 
protein or portion thereof is indicated by the presence of the candidate compound in the 
complex. Formation of complexes between the EMI1 protein and the candidate 
compound can be quantitated, for example, using standard immunoassays. 

In another embodiment, the invention provides screening assays to identify 
candidate/test compounds which modulate (e.g., stimulate or inhibit) the interaction (and 
most likely EMI1 activity as well) between an EMI1 protein and a molecule (target 
molecule) with which the EMI1 protein normally interacts. Examples of such target 
molecules include proteins in the same signaling path as the EMI1 protein, e.g., 
proteins which may function upstream (including both stimulators and inhibitors of 
activity) or downstream of the EMI1 protein in the TGF-β signaling pathway, e.g., an 
MADR protein. Typically, the assays are cell-free assays which include the steps of 
combining an EMI1 protein or a biologically active portion thereof, an EMI1 target 
molecule (e.g., an EMI1 ligand) and a candidate/test compound, e.g., under conditions 
wherein but for the presence of the candidate compound, the EMI1 protein or 
biologically active portion thereof interacts with (e.g., binds to) the target molecule, and 
detecting the formation of a complex which includes the EMI1 protein and the target 
molecule or detecting the interaction/reaction of the EMI1 protein and the target 
molecule. Detection of complex formation can include direct quantitation of the 
complex by, for example, measuring inductive effects of the EMI1 protein. A 
statistically significant change, such as a decrease, in the interaction of the EMI1 and 
target molecule (e.g., in the formation of a complex between the EMI1 and the target 
molecule) in the presence of a candidate compound (relative to what is detected in the 
absence of the candidate compound) is indicative of a modulation (e.g., stimulation or 
inhibition) of the interaction between the EMI1 protein and the target molecule. 
Modulation of the formation of complexes between the EMI1 protein and the target 
molecule can be quantitated using, for example, an immunoassay. 

To perform the above drug screening assays, it is desirable to immobilize either 
EMI1 or its target molecule to facilitate separation of complexes from uncomplexed 
forms of one or both of the proteins, as well as to accommodate automation of the assay. 
Interaction (e.g., binding) of EMI1 to a target molecule, in the presence and absence 
of a candidate compound, can be accomplished in any vessel suitable for containing the 
reactants. Examples of such vessels include microtitre plates, test tubes, and 
micro-centrifuge tubes. In one embodiment, a fusion protein can be provided which adds a 
domain that allows the protein to be bound to a matrix. For example, 
glutathione-S-transferase/EMI1 fusion proteins can be adsorbed onto glutathione sepharose beads 
(Sigma Chemical, St. Louis, MO) or glutathione derivatized microtitre plates, which are 
then combined with the cell lysates (e.g., 35S-labeled) and the candidate compound, and 
the mixture incubated under conditions conducive to complex formation (e.g., at 
physiological conditions for salt and pH). Following incubation, the beads are washed 
to remove any unbound label, and the matrix immobilized and radiolabel determined 
directly, or in the supernatant after the complexes are dissociated. Alternatively, the 
complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of 
EMI1-binding protein found in the bead fraction quantitated from the gel using standard 
electrophoretic techniques. 

Other techniques for immobilizing proteins on matrices can also be used in the 
drug screening assays of the invention. For example, either EMI1 or its target molecule 
can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated EMI1 
molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques 
well known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, IL), and 
immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical). 
Alternatively, antibodies reactive with EMI1 but which do not interfere with binding of 
the protein to its target molecule can be derivatized to the wells of the plate, and EMI1 
trapped in the wells by antibody conjugation. As described above, preparations of an 
EMI1-binding protein and a candidate compound are incubated in the EMI1-presenting 
wells of the plate, and the amount of complex trapped in the well can be quantitated. 
Methods for detecting such complexes, in addition to those described above for the 
GST-immobilized complexes, include immunodetection of complexes using antibodies 
reactive with the EMI1 target molecule, or which are reactive with EMI1 protein and 
compete with the target molecule; as well as enzyme-linked assays which rely on 
detecting an enzymatic activity associated with the target molecule. 

In yet another embodiment, the invention provides a method for identifying a 
compound (e.g., a screening assay) capable of use in the treatment of a disorder 
characterized by (or associated with) aberrant or abnormal EMI1 nucleic acid expression 
or EMI1 protein activity. This method typically includes the step of assaying the ability 
of the compound or agent to modulate the expression of the EMI1 nucleic acid or the 
activity of the EMI1 protein thereby identifying a compound for treating a disorder 
characterized by aberrant or abnormal EMI1 nucleic acid expression or EMI1 protein 
activity. Disorders characterized by aberrant or abnormal EMI1 nucleic acid expression 
or EMI1 protein activity are described herein. Methods for assaying the ability of the 
compound or agent to modulate the expression of the EMI1 nucleic acid or activity of 
the EMI1 protein are typically cell-based assays. For example, cells which are sensitive 
to ligands, e.g., TGF-β, which transduce signals via a pathway involving EMI1 can be 
induced to overexpress an EMI1 protein in the presence and absence of a candidate 
compound. Candidate compounds which produce a statistically significant change in 
EMI1-dependent responses (either stimulation or inhibition) can be identified. In one 
embodiment, expression of the EMI1 nucleic acid or activity of an EMI1 protein is 
modulated in cells and the effects of candidate compounds on the readout of interest 
(such as rate of cell proliferation or differentiation) are measured. For example, the 
expression of genes which are up- or down-regulated in response to an EMI1-dependent 
signal cascade can be assayed. In preferred embodiments, the regulatory regions of such 
genes, e.g., the 5' flanking promoter and enhancer regions, are operably linked to a 
detectable marker (such as luciferase) which encodes a gene product that can be readily 
detected. Phosphorylation of EMI1 or EMI1 target molecules can also be measured, for 
example, by immunoblotting. 

Alternatively, modulators of EMI1 expression (e.g., compounds which can be 
used to treat a disorder characterized by aberrant or abnormal EMI1 nucleic acid 
expression or EMI1 protein activity) can be identified in a method wherein a cell is 
contacted with a candidate compound and the expression of EMI1 mRNA or protein in 
the cell is determined. The level of expression of EMI1 mRNA or protein in the 
presence of the candidate compound is compared to the level of expression of EMI1 
mRNA or protein in the absence of the candidate compound. The candidate compound 
can then be identified as a modulator of EMI1 nucleic acid expression based on this 
comparison and be used to treat a disorder characterized by aberrant EMI1 nucleic acid 
expression. For example, when expression of EMI1 mRNA or protein is greater 
(statistically significantly greater) in the presence of the candidate compound than in its 
absence, the candidate compound is identified as a stimulator of EMI1 mRNA or protein 
expression. Alternatively, when expression of EMI1 mRNA or protein is less 
(statistically significantly less) in the presence of the candidate compound than in its 
absence, the candidate compound is identified as an inhibitor of EMI1 mRNA or protein 
expression. The level of EMI1 mRNA or protein expression in the cells can be 
determined by methods described herein for detecting EMI1 mRNA or protein. 
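
The comparison described above can be illustrated with a minimal Python sketch. It is not part of the disclosed method: the function names, the replicate values, the 0.05 threshold, and the choice of an exact permutation test as the significance test are all assumptions made for illustration, since the text requires only that the difference be statistically significant.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Two-sided exact permutation test on the difference of group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    observed = abs(mean(treated) - mean(control))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels with and without it.

    `treated` and `control` are hypothetical replicate measurements of
    mRNA or protein expression (arbitrary units).
    """
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"
```

With four replicates per group the smallest attainable p-value of this test is 2/70, so at least four replicates per condition are needed for a result below 0.05; with fewer replicates a different test would have to be assumed.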

In yet another aspect of the invention, the EMI1 proteins can be used as "bait 
proteins" in a two-hybrid assay (see, e.g., U.S. Patent No. 5,283,317; Zervos et al. 
(1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et 
al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; 
and Brent WO94/10300), to identify other proteins, which bind to or interact with EMI1 
("EMI1-binding proteins" or "EMI1-bp") and modulate EMI1 protein activity. Such 
EMI1-binding proteins are also likely to be involved in the propagation of signals by the 
EMI1 proteins as, for example, upstream or downstream elements of the EMI1 pathway. 

The two-hybrid system is based on the modular nature of most transcription 
factors, which consist of separable DNA-binding and activation domains. Briefly, the 
assay utilizes two different DNA constructs. In one construct, the gene that codes for 
EMI1 is fused to a gene encoding the DNA binding domain of a known transcription 
factor (e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA 
sequences, that encodes an unidentified protein ("prey" or "sample") is fused to a gene 
that codes for the activation domain of the known transcription factor. If the "bait" and 
the "prey" proteins are able to interact, in vivo, forming an EMI1-dependent complex, 
the DNA-binding and activation domains of the transcription factor are brought into 
close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) 
which is operably linked to a transcriptional regulatory site responsive to the 
transcription factor. Expression of the reporter gene can be detected and cell colonies 
containing the functional transcription factor can be isolated and used to obtain the 
cloned gene which encodes the protein which interacts with EMI1. 

Modulators of EMI1 protein activity and/or EMI1 nucleic acid expression 
identified according to these drug screening assays can be used to treat, for example, 
cardiovascular diseases or disorders such as atherosclerosis, ischemia/reperfusion, 
hypertension, and restenosis. Examples of other cardiovascular diseases or disorders 
which can be treated using modulators of EMI1 protein activity and/or nucleic acid 
expression are described in Robbins, S.L. et al. eds. Pathologic Basis of Disease (W.B. 
Saunders Company, Philadelphia, PA 1984) 502-547. These methods of treatment 
include the steps of administering the modulators of EMI1 protein activity and/or 
nucleic acid expression, e.g., in a pharmaceutical composition as described in subsection 
IV above, to a subject in need of such treatment, e.g., a subject with cardiovascular 
disease. 

b. Diagnostic Assays: 

The invention further provides a method for detecting the presence of EMI1 in a 
biological sample. The method involves contacting the biological sample with a 
compound or an agent capable of detecting EMI1 protein or mRNA such that the 
presence of EMI1 is detected in the biological sample. A preferred agent for detecting 
EMI1 mRNA is a labeled or labelable nucleic acid probe capable of hybridizing to EMI1 
mRNA. The nucleic acid probe can be, for example, the full-length EMI1 cDNA of 
SEQ ID NO:1, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 
100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under 
stringent conditions to EMI1 mRNA. A preferred agent for detecting EMI1 protein is a 
labeled or labelable antibody capable of binding to EMI1 protein. Antibodies can be 
polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof 
(e.g., Fab or F(ab')2) can be used. The term "labeled or labelable", with regard to the 
probe or antibody, is intended to encompass direct labeling of the probe or antibody by 
coupling (i.e., physically linking) a detectable substance to the probe or antibody, as 
well as indirect labeling of the probe or antibody by reactivity with another reagent that 
is directly labeled. Examples of indirect labeling include detection of a primary 
antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA 
probe with biotin such that it can be detected with fluorescently labeled streptavidin. 
The term "biological sample" is intended to include tissues, cells and biological fluids 
isolated from a subject, as well as tissues, cells and fluids present within a subject. That 
is, the detection method of the invention can be used to detect EMI1 mRNA or protein 
in a biological sample in vitro as well as in vivo. For example, in vitro techniques for 
detection of EMI1 mRNA include Northern hybridizations and in situ hybridizations. In 
vitro techniques for detection of EMI1 protein include enzyme linked immunosorbent 
assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. 
Alternatively, EMI1 protein can be detected in vivo in a subject by introducing into the 
subject a labeled anti-EMI1 antibody. For example, the antibody can be labeled with a 
radioactive marker whose presence and location in a subject can be detected by standard 
imaging techniques. 

In one preferred embodiment of the detection method, the biological sample is an 
endothelial cell sample. The endothelial cell sample can comprise vascular tissue or a 
suspension of endothelial cells. A tissue section, for example, a freeze-dried or fresh 
frozen section of vascular tissue removed from a patient, can be used as the endothelial 
cell sample. Alternatively, the biological sample can comprise a biological fluid 
obtained from a subject having a cardiovascular disorder. In another preferred 
embodiment of the detection method, the biological sample is an epithelial cell sample 
(e.g., a sample which includes gut-derived epithelial cells). A tissue section, for 
example, a freeze-dried or fresh frozen section of epithelial cell-based tumor tissue (e.g., 
carcinoma tissue) removed from a patient, can be used as the epithelial cell sample. 

The invention also encompasses kits for detecting the presence of EMI1 in a 
biological sample. For example, the kit can comprise a labeled or labelable compound 
or agent capable of detecting EMI1 protein or mRNA in a biological sample; means for 
determining the amount of EMI1 in the sample; and means for comparing the amount of 
EMI1 in the sample with a standard. The compound or agent can be packaged in a 
suitable container. The kit can further comprise instructions for using the kit to detect 
EMI1 mRNA or protein. 

The methods of the invention can also be used to detect genetic lesions in an 
EMI1 gene, thereby determining if a subject with the lesioned gene is at risk for a 
disorder characterized by aberrant or abnormal EMI1 nucleic acid expression or EMI1 
protein activity as defined herein. In preferred embodiments, the methods include 
detecting, in a sample of cells from the subject, the presence or absence of a genetic 
lesion characterized by at least one of an alteration affecting the integrity of a gene 
encoding an EMI1 protein, or the misexpression of the EMI1 gene. For example, such 
genetic lesions can be detected by ascertaining the existence of at least one of 1) a 
deletion of one or more nucleotides from an EMI1 gene; 2) an addition of one or more 
nucleotides to an EMI1 gene; 3) a substitution of one or more nucleotides of an EMI1 
gene; 4) a chromosomal rearrangement of an EMI1 gene; 5) an alteration in the level of 
a messenger RNA transcript of an EMI1 gene; 6) aberrant modification of an EMI1 
gene, such as of the methylation pattern of the genomic DNA; 7) the presence of a 
non-wild type splicing pattern of a messenger RNA transcript of an EMI1 gene; 8) a 
non-wild type level of an EMI1 protein; 9) allelic loss of an EMI1 gene; and 10) 
inappropriate post-translational modification of an EMI1 protein. As described herein, 
there are a large number of assay techniques known in the art which can be used for 
detecting lesions in an EMI1 gene. 

In certain embodiments, detection of the lesion involves the use of a 
probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 
and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain 
reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and 
Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful 
for detecting point mutations in the EMI1 gene (see Abravaya et al. (1995) Nucleic 
Acids Res. 23:675-682). This method can include the steps of collecting a sample of 
cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells 
of the sample, contacting the nucleic acid sample with one or more primers which 
specifically hybridize to an EMI1 gene under conditions such that hybridization and 
amplification of the EMI1 gene (if present) occurs, and detecting the presence or 
absence of an amplification product, or detecting the size of the amplification product 
and comparing the length to a control sample. 

In an alternative embodiment, mutations in an EMI1 gene from a sample cell can 
be identified by alterations in restriction enzyme cleavage patterns. For example, 
sample and control DNA is isolated, amplified (optionally), digested with one or more 
restriction endonucleases, and fragment length sizes are determined by gel 
electrophoresis and compared. Differences in fragment length sizes between sample and 
control DNA indicate mutations in the sample DNA. Moreover, sequence 
specific ribozymes (see, for example, U.S. Patent No. 5,498,531) can be used to score 
for the presence of specific mutations by development or loss of a ribozyme cleavage 
site. 
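
The fragment-length comparison in this embodiment can be sketched in Python as follows. This is an illustration under stated assumptions only: the sequences are toy strings rather than EMI1 sequence, the function names are hypothetical, and the default enzyme is EcoRI (recognition site GAATTC, cutting after the first G), chosen simply as a familiar example.

```python
def digest(sequence, site, cut_offset):
    """Return sorted fragment lengths after cutting a linear sequence.

    The sequence is cut `cut_offset` bases into every occurrence of `site`.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def fragments_differ(sample, control, site="GAATTC", cut_offset=1):
    """True when sample and control give different fragment length sets."""
    return digest(sample, site, cut_offset) != digest(control, site, cut_offset)
```

A sample in which a point mutation destroys one recognition site yields fewer, longer fragments than the control, which is the kind of difference the gel comparison is meant to reveal.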

In yet another embodiment, any of a variety of sequencing reactions known in 
the art can be used to directly sequence the EMI1 gene and detect mutations by 
comparing the sequence of the sample EMI1 with the corresponding wild-type (control) 
sequence. Examples of sequencing reactions include those based on techniques 
developed by Maxam and Gilbert ((1977) PNAS 74:560) or Sanger ((1977) PNAS 
74:5463). A variety of automated sequencing procedures can be utilized when 
performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing 
by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; 
Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. 
Biochem. Biotechnol. 38:147-159). 
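
Once sample and wild-type sequences are in hand, the comparison step amounts to a position-by-position check. The following Python sketch assumes the two sequences have already been aligned to equal length and reports substitutions only; the function name and the example sequences are hypothetical and are not EMI1 sequence.

```python
def find_substitutions(wild_type, sample):
    """List (position, wild-type base, sample base) for each mismatch.

    Positions are 1-based. Insertions and deletions are out of scope here:
    the sequences must be pre-aligned and of equal length.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be the same length; align them first")
    return [
        (i + 1, w, s)
        for i, (w, s) in enumerate(zip(wild_type, sample))
        if w != s
    ]
```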

Other methods for detecting mutations in the EMI1 gene include methods in 
which protection from cleavage agents is used to detect mismatched bases in RNA/RNA 
or RNA/DNA duplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) 
PNAS 85:4397; Saleeba et al. (1992) Meth. Enzymol. 217:286-295), electrophoretic 
mobility of mutant and wild type nucleic acid is compared (Orita et al. (1989) PNAS 
86:2766; Cotton (1993) Mutat Res 285:125-144; and Hayashi (1992) Genet Anal Tech 
Appl 9:73-79), and movement of mutant or wild-type fragments in polyacrylamide gels 
containing a gradient of denaturant is assayed using denaturing gradient gel 
electrophoresis (Myers et al. (1985) Nature 313:495). Examples of other techniques for 
detecting point mutations include selective oligonucleotide hybridization, selective 
amplification, and selective primer extension. 



WO 98/45467    PCT/US98/07356



c. Methods of Treatment 

Another aspect of the invention pertains to methods for treating a subject, e.g., a 
human, having a disease or disorder characterized by (or associated with) aberrant or 
abnormal EMI1 nucleic acid expression and/or EMI1 protein activity. These methods 
include the step of administering an EMI1 modulator to the subject such that treatment 
occurs. The language "aberrant or abnormal EMI1 expression" refers to expression of a 
non-wild-type EMI1 protein or a non-wild-type level of expression of an EMI1 protein. 
Aberrant or abnormal EMI1 activity refers to a non-wild-type EMI1 activity or a 
non-wild-type level of EMI1 activity. As the EMI1 protein is involved in the TGF-β 
signaling pathway, aberrant or abnormal EMI1 activity or expression interferes with the 
normal TGF-β effects on TGF-β responsive cells. Non-limiting examples of disorders 
or diseases characterized by or associated with abnormal or aberrant EMI1 activity or 
expression include cardiovascular disorders and proliferative disorders (e.g., cancers). 
Cardiovascular disorders are disorders which detrimentally affect normal cardiovascular 
function. Examples of cardiovascular disorders include atherosclerosis, 
ischemia/reperfusion, hypertension, restenosis, and arterial inflammation. Proliferative 
disorders are disorders which are associated with uncontrolled or undesirable cell 
proliferation. Examples of proliferative disorders include proliferative disorders of 
epithelial cells, e.g., proliferative disorders of gut derived cells, e.g., pancreatic cancer 
and colorectal cancer. Additional methods of the invention include methods for treating 
a subject having a disorder characterized by aberrant EMI1 activity or expression. These 
methods include administering to the subject an EMI1 modulator such that treatment of 
the subject occurs. The terms "treating" or "treatment", as used herein, refer to reduction 
or alleviation of at least one adverse effect or symptom of a disease or disorder, e.g., a 
disease or disorder characterized by or associated with abnormal or aberrant EMI1 
protein activity or EMI1 nucleic acid expression. 

As used herein, an EMI1 modulator is a molecule which can modulate EMI1
nucleic acid expression and/or EMI1 protein activity. For example, an EMI1 modulator
can modulate, e.g., upregulate (activate) or downregulate (suppress), EMI1 nucleic acid
expression. In another example, an EMI1 modulator can modulate (e.g., stimulate or
inhibit) EMI1 protein activity. If it is desirable to treat a disease or disorder
characterized by (or associated with) aberrant or abnormal (non-wild-type) EMI1 nucleic
acid expression and/or EMI1 protein activity by inhibiting EMI1 nucleic acid
expression, an EMI1 modulator can be an antisense molecule, e.g., a ribozyme, as
described herein. Examples of antisense molecules which can be used to inhibit EMI1
nucleic acid expression include antisense molecules which are complementary to a
portion of the 5' untranslated region of SEQ ID NO:1 which also includes the start codon
and antisense molecules which are complementary to a portion of the 3' untranslated
region of SEQ ID NO:1. An example of an antisense molecule which is complementary
to a portion of the 5' untranslated region of SEQ ID NO:1 and which also includes the
start codon is a nucleic acid molecule which includes nucleotides which are
complementary to nucleotides 58 to 79 of SEQ ID NO:1. This antisense molecule has
the following nucleotide sequence: 5' CGTCGAAGTGCCACTACTATAC 3' (SEQ ID
NO:5). An example of an antisense molecule which is complementary to a portion of
the 3' untranslated region of SEQ ID NO:1 is a nucleic acid molecule which includes
nucleotides which are complementary to nucleotides 1102 to 1118 of SEQ ID NO:1.
This antisense molecule has the following sequence: 5' AGTTCCAGAGTCTCAGG 3'
(SEQ ID NO:6). An additional example of an antisense molecule which is
complementary to a portion of the 3' untranslated region of SEQ ID NO:1 is a nucleic
acid molecule which includes nucleotides which are complementary to nucleotides 1169
to 1188 of SEQ ID NO:1. This antisense molecule has the following sequence: 5'
ACGTCACTGTGCTATGCTAC 3' (SEQ ID NO:7). An EMI1 modulator which
inhibits EMI1 nucleic acid expression can also be a small molecule or other drug, e.g., a
small molecule or drug identified using the screening assays described herein, which
inhibits EMI1 nucleic acid expression. If it is desirable to treat a disease or disorder
characterized by (or associated with) aberrant or abnormal (non-wild-type) EMI1 nucleic
acid expression and/or EMI1 protein activity by stimulating EMI1 nucleic acid
expression, an EMI1 modulator can be, for example, a nucleic acid molecule encoding
EMI1 (e.g., a nucleic acid molecule comprising a nucleotide sequence homologous to
the nucleotide sequence of SEQ ID NO:1) or a small molecule or other drug, e.g., a
small molecule (peptide) or drug identified using the screening assays described herein,
which stimulates EMI1 nucleic acid expression.
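As an illustrative check of the three antisense oligonucleotides listed above, the following minimal Python sketch confirms that each oligonucleotide is as long as the nucleotide span it is stated to complement, and computes the sense-strand sequence each would pair with under the standard antiparallel convention. The function and variable names are hypothetical, and the computed target sequences have not been compared with SEQ ID NO:1, which is not reproduced in this section.

```python
# Sketch only: names are illustrative and not part of the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def span_length(first: int, last: int) -> int:
    """Number of nucleotides in an inclusive coordinate range."""
    return last - first + 1


# Oligonucleotide sequence and the SEQ ID NO:1 coordinates it is said to complement.
ANTISENSE_OLIGOS = {
    "SEQ ID NO:5": ("CGTCGAAGTGCCACTACTATAC", 58, 79),
    "SEQ ID NO:6": ("AGTTCCAGAGTCTCAGG", 1102, 1118),
    "SEQ ID NO:7": ("ACGTCACTGTGCTATGCTAC", 1169, 1188),
}

for name, (oligo, first, last) in ANTISENSE_OLIGOS.items():
    # The oligonucleotide length should equal the length of the stated target span.
    lengths_agree = len(oligo) == span_length(first, last)
    print(name, len(oligo), "nt; lengths agree:", lengths_agree)
    print("  implied sense-strand target (5'->3'):", reverse_complement(oligo))
```

The length comparison is the only property that can be verified from the text alone; the implied target sequences are shown for orientation only.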

Alternatively, if it is desirable to treat a disease or disorder characterized by (or
associated with) aberrant or abnormal (non-wild-type) EMI1 nucleic acid expression
and/or EMI1 protein activity by inhibiting EMI1 protein activity, an EMI1 modulator
can be an anti-EMI1 antibody or a small molecule or other drug, e.g., a small molecule
or drug identified using the screening assays described herein, which inhibits EMI1
protein activity. If it is desirable to treat a disease or disorder characterized by (or
associated with) aberrant or abnormal (non-wild-type) EMI1 nucleic acid expression
and/or EMI1 protein activity by stimulating EMI1 protein activity, an EMI1 modulator
can be an active EMI1 protein or portion thereof (e.g., an EMI1 protein or portion
thereof having an amino acid sequence which is homologous to the amino acid sequence
of SEQ ID NO:2 or a portion thereof) or a small molecule or other drug, e.g., a small
molecule or drug identified using the screening assays described herein, which
stimulates EMI1 protein activity.

In addition, a subject having a cardiovascular disorder can be treated according
to the present invention by administering to the subject an EMI1 protein or portion
thereof or a nucleic acid encoding an EMI1 protein or portion thereof such that
treatment occurs. Similarly, a subject having a proliferative disorder can be treated
according to the present invention by administering to the subject an EMI1 protein or
portion thereof or a nucleic acid encoding an EMI1 protein or portion thereof such that
treatment occurs.

Other aspects of the invention pertain to methods for modulating a cell
associated activity. These methods include contacting the cell with an agent (or a
composition which includes an effective amount of an agent) which modulates EMI1
activity or EMI1 expression such that a cell associated activity is altered relative to a cell
associated activity of the cell in the absence of the agent. As used herein, "a cell
associated activity" refers to a normal or abnormal activity or function of a cell.
Examples of cell associated activities include proliferation, migration, differentiation,
production or secretion of molecules, such as proteins, and cell survival. In a preferred
embodiment, the cell is a TGF-β responsive cell, e.g., a cell which responds to TGF-β
signaling through a pathway which involves EMI1. Examples of cells which respond to
TGF-β signaling through a pathway which involves EMI1 are endothelial cells and
epithelial cells. The term "altered" as used herein refers to a change, e.g., an increase or
decrease, of a cell associated activity. In one embodiment, the agent stimulates EMI1
protein activity or EMI1 nucleic acid expression. Examples of such stimulatory agents
include an active EMI1 protein, a nucleic acid molecule encoding EMI1 that has been
introduced into the cell, and a modulatory agent which stimulates EMI1 protein activity
or EMI1 nucleic acid expression and which is identified using the drug screening assays
described herein. In another embodiment, the agent inhibits EMI1 protein activity or
EMI1 nucleic acid expression. Examples of such inhibitory agents include an antisense
EMI1 nucleic acid molecule, an anti-EMI1 antibody, and a modulatory agent which
inhibits EMI1 protein activity or EMI1 nucleic acid expression and which is identified
using the drug screening assays described herein. These modulatory methods can be
performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo
(e.g., by administering the agent to a subject). In a preferred embodiment, the
modulatory methods are performed in vivo, i.e., the cell is present within a subject, e.g.,
a mammal, e.g., a human, and the subject has a disorder or disease characterized by or
associated with abnormal or aberrant EMI1 activity or expression.

A nucleic acid molecule, a protein, an EMI1 modulator etc. used in the methods
of treatment can be incorporated into an appropriate pharmaceutical composition
described herein and administered to the subject through a route which allows the
molecule, protein, modulator etc. to perform its intended function. Examples of routes
of administration are also described herein under subsection IV.

This invention is further illustrated by the following examples which should not
be construed as limiting. The contents of all references, patent applications, patents, and
published patent applications cited throughout this application are hereby incorporated
by reference.

EXAMPLES 

The following materials and methods were used in the Examples:

Yeast Strains, Media, and Microbiological Techniques

Yeast strains, E. coli strains, and plasmids used in this work are listed in Tables
1-3 below. Standard yeast media including synthetic complete medium lacking L-leucine,
L-tryptophan, and L-histidine were prepared and yeast genetic manipulations
were performed as described (Sherman (1991) Meth. Enzymol. 194:3-21). Yeast
transformations were performed using standard protocols (Gietz et al. (1992) Nucleic
Acids Res. 20:1425; Ito et al. (1983) J. Bacteriol. 153:163-168). Plasmid DNAs were
isolated from yeast strains by a standard method (Hoffman and Winston (1987) Gene
57:267-272).

TABLE 1
E. coli Strains

E. coli Strain | Genotype | Source or Derivation
---------------|----------|---------------------
PEB199 | F- ompT hsdSB (rB- mB-) gal dcm lon | BL21 lon (Studier (1991) J. Mol. Biol. 219:37-44) derivative.

TABLE 2
Yeast Strains

Yeast Strain | Genotype | Source or Derivation
-------------|----------|---------------------
HF7c | MATa ura3-52 his3-200 lys2-801 ade2-101 trp1-901 leu2-3,112 gal4-542 gal80-538 LYS2::GAL1(UAS)-GAL1(TATA)-HIS3 URA3::GAL4 17-mers(x3)-CYC1(TATA)-lacZ | Feilotter et al. (1994) Nucleic Acids Res. 22:1502-1503.
Y187 | MATα gal4 gal80 his3 trp1-901 ade2-101 ura3-52 leu2-3,112 met- URA3::GAL-lacZ | Bai and Elledge (1995) Methods Enzymol. 273:331-347.
TB35 | HF7c + pMB155 | Prepared for experiments described herein.
TB30 | HF7c + pYCHD534b | Applicants' collection.
TB32 | HF7c + pYCFX011 | Applicants' collection.
TB29 | HF7c + pYCHX01 | Applicants' collection.
TB19 | HF7c + p53 | Applicants' collection.
TB17 | HF7c + pGBT9 | Applicants' collection.
TB39 | HF7c + pMB146 | Prepared for experiments described herein.
TF4 | Y187 + pSV40 | Prepared for experiments described herein.
TB40 | HF7c + pMB147 | Prepared for experiments described herein.
TB41 | HF7c + pMB148 | Prepared for experiments described herein.

TABLE 3
Plasmids

Plasmid Name | Description | Source or Derivation
-------------|-------------|---------------------
pACTII | GAL4(768-881) fusion vector | Bai, C. and Elledge, S.J. (1995) Methods Enzymol. 273:331-347.
pGBT9 | GAL4(1-147) fusion vector marked with TRP1 and ampR | Bartel et al. (1993) Cellular Interactions in Development 153-159.
pMB155 | coding sequence of the fchd540 gene product (MADR7) cloned in-frame into pGBT9 | Prepared for experiments described herein.
pYCHD534b | coding sequence of the fchd534 gene product (MADR6) cloned in-frame into pGBT9 | Applicants' collection.
pYCFX011 | Drosophila MAD coding sequence cloned in-frame into pGBT9 | Applicants' collection.
pYCHX01 | DPC4 coding sequence cloned in-frame into pGBT9 | Applicants' collection.
p53 | p53 control plasmid | HybriZAP Two-Hybrid Vector Kit (Stratagene, La Jolla, CA)
pSV40 | SV40 control plasmid | HybriZAP Two-Hybrid Vector Kit (Stratagene, La Jolla, CA)
pGEX-5X-2 | GST gene fusion vector | Pharmacia Biotech, Inc. (Piscataway, NJ)
pMB140B | EMI1 138-335 cloned in pGEX-5X-2 | Prepared for experiments described herein.
pMB146 | PY motif of the fchd540 gene product (MADR7) cloned in pGBT9 | Prepared for experiments described herein.
pN8epsilon-534 (MADR6)-myc | coding sequence of the fchd534 gene product (MADR6) in pN8epsilon-myc | Prepared for experiments described herein.
pN8epsilon-540 (MADR7)-myc | coding sequence of the fchd540 gene product (MADR7) in pN8epsilon-myc | Prepared for experiments described herein.
pN8epsilon-EMI1-HA | EMI1 coding sequence in pN8epsilon-HA | Prepared for experiments described herein.
pN8epsilon-myc | CMV promoter-driven mammalian expression vector that fuses three copies of the myc epitope tag to test proteins | Applicants' collection.
pN8epsilon-HA | CMV promoter-driven mammalian expression vector that fuses three copies of the HA epitope tag to test proteins | Applicants' collection.
pMB147 | P→A mutant of the PY domain of the fchd540 gene product (MADR7) | Prepared for experiments described herein.
pMB148 | Y→A mutant of the PY domain of the fchd540 gene product (MADR7) | Prepared for experiments described herein.



Plasmid and Yeast Strain Construction

The coding region of the human fchd540 gene product (also known as MADR7)
was amplified by PCR and cloned in-frame into pGBT9 resulting in plasmid pMB155.
pMB155 was transformed into two-hybrid screening strain HF7c, and one resulting
transformant was designated TB35.

The coding region of the human fchd534 gene product (also known as MADR6)
was amplified by PCR and cloned in-frame into pGBT9 resulting in plasmid
pYCHD534b. pYCHD534b was transformed into two-hybrid screening strain HF7c,
and one resulting transformant was designated TB30.

The coding region of the Drosophila MAD gene (Sekelsky et al. (1995) Genetics
139:1347-1358) was amplified by PCR and cloned in-frame into pGBT9 resulting in
plasmid pYCFX011. pYCFX011 was transformed into two-hybrid screening strain
HF7c, and one resulting transformant was designated TB32.

The coding region of the DPC4 gene (Hahn et al. (1996) Science 271:350-353)
was amplified by PCR and cloned in-frame into pGBT9 resulting in plasmid pYCHX01.
pYCHX01 was transformed into two-hybrid screening strain HF7c, and one resulting
transformant was designated TB29.

DNA encoding amino acids 138-335 of EMI1 was amplified by PCR and cloned
in-frame into pGEX-5X-2 resulting in plasmid pMB140B.

Complementary oligonucleotides encoding the 16 amino acid PY motif of the
fchd540 gene product (MADR7) (RLCELESPPPPYSRYP (SEQ ID NO:8)) were
synthesized, annealed, and cloned into pGBT9 resulting in plasmid pMB146.

The coding region of the human fchd534 gene (MADR6) was amplified by PCR
and cloned in-frame into pN8epsilon-myc resulting in plasmid pN8epsilon-fchd534
gene-myc.

The coding region of the human fchd540 gene (MADR7) was amplified by PCR
and cloned in-frame into pN8epsilon-myc resulting in plasmid pN8epsilon-fchd540
gene-myc.

The coding region of human EMI1 was amplified by PCR and cloned in-frame
into pN8epsilon-HA resulting in plasmid pN8epsilon-EMI1-HA.

Complementary oligonucleotides encoding the 16 amino acid PY motif of the
fchd540 gene product (MADR7) with the proline at position 10 mutated to alanine
(RLCELESPPAPYSRYP (SEQ ID NO:9)) were synthesized, annealed, and cloned into
pGBT9 resulting in plasmid pMB147.

Complementary oligonucleotides encoding the 16 amino acid PY motif of the
fchd540 gene product (MADR7) with the tyrosine at position 12 mutated to alanine
(RLCELESPPPPASRYP (SEQ ID NO:10)) were synthesized, annealed, and cloned into
pGBT9 resulting in plasmid pMB148.



Two-Hybrid Screening

Two-hybrid screening was carried out essentially as described (Bartel et al.
(1993) Cellular Interactions in Development 153-159) using MY114 as the recipient
strain and a human breast two-hybrid library constructed in the lambda ACT II vector.

Beta Galactosidase Assays

The filter disk beta-galactosidase (beta-gal) assay was performed essentially as
previously described (Brill et al. (1994) Mol. Biol. Cell 5:297-312). Briefly, strains to
be tested were grown overnight at 30°C as patches of cells on the medium dictated by
the experiment. The patches or colonies of cells were replica plated to
Whatman #50 paper disks (#576 from Schleicher & Schuell, Keene, NH) that had been
placed on the test medium in petri dishes. After growth overnight at 30°C, the paper
disks were removed from the plates and the cells on them were permeabilized by
immediate immersion in liquid nitrogen for 30 seconds. After this treatment, the paper
disks were thawed at room temperature for 20 seconds and then placed in petri dishes
that contained a disk of Whatman #3 paper (#593 from Schleicher & Schuell, Keene,
NH) saturated with 2.5 ml of Z buffer containing 37 µl of 2% weight per volume of the
chromogenic beta-gal substrate X-gal. The permeabilized strains on the paper disks
were incubated at 30°C and inspected at timed intervals for the blue color diagnostic of
beta-gal activity in this assay. The assay was stopped by removing the paper disk
containing the patches of cells and air drying it.
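The X-gal working concentration implied by the volumes above can be worked out with a short calculation. The following Python sketch is illustrative only; the function name is hypothetical, and it assumes that the 37 µl of stock is simply added to the 2.5 ml of Z buffer.

```python
def final_concentration_mg_per_ml(stock_percent_wv: float,
                                  stock_volume_ul: float,
                                  buffer_volume_ml: float) -> float:
    """Final concentration after adding a % (w/v) stock to a buffer volume."""
    stock_mg_per_ml = stock_percent_wv * 10.0             # 1% (w/v) = 10 mg/ml
    mass_mg = stock_mg_per_ml * stock_volume_ul / 1000.0  # mass of substrate added
    total_volume_ml = buffer_volume_ml + stock_volume_ul / 1000.0
    return mass_mg / total_volume_ml


# 37 µl of 2% (w/v) X-gal in 2.5 ml of Z buffer
xgal = final_concentration_mg_per_ml(2.0, 37.0, 2.5)
print(f"X-gal: {xgal:.2f} mg/ml")
```

Under these assumptions each saturated filter carries about 0.74 mg of X-gal, i.e. roughly 0.3 mg/ml.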



Expression and Purification of Recombinant EMI1 Protein

A culture of E. coli strain PEB199 carrying the pMB140B EMI1
GST-fusion plasmid was grown overnight in TB medium containing 100 µg/ml ampicillin.
The following day the culture was diluted 1:10 in fresh TB medium containing 100 µg/ml
ampicillin and grown to an OD600 of 0.6-0.8. IPTG was added to the culture to a final
concentration of 0.5-1.0 mM and the culture was then incubated for 3-4 hours at 37°C.
The culture was pelleted and stored frozen (-80°C) for 1 day. The pellet was thawed,
resuspended in 20-50 ml of PBS, and passed through a French press 2-3 times at
20,000 psi. Disruption was monitored by taking OD600 readings of the lysate. The
lysate was centrifuged for 30 minutes at 15,000 x g and the supernatant was decanted to
a fresh tube. Glutathione Sepharose 4B resin (Pharmacia Biotech, Inc., Piscataway, NJ)
was washed with 5-10 column volumes of PBS to remove resin storage buffer. The
supernatant was added to the washed resin. The resulting slurry was added to a 50 ml
conical tube and batch binding was allowed to proceed for one hour. The slurry was
washed twice with 10 column volumes of PBS and then the recombinant protein was
eluted with a 50 mM Tris-HCl pH 8.0 buffer containing 50 mM reduced glutathione.
Eluted proteins were analyzed by electrophoresis on a 14% Tris-glycine SDS
polyacrylamide gel (Novex, San Diego, CA) and subsequent Coomassie staining.

Coimmunoprecipitation Analysis

Primary bovine aortic endothelial cells (BAECs) were transfected with 2 µg of
pN8epsilon-fchd534 gene (MADR6)-myc or pN8epsilon-fchd540 gene (MADR7)-myc
and 10 µg of pN8epsilon-EMI1-HA using the calcium phosphate method. pN8epsilon-myc
is a plasmid derived from pCI (Promega, Madison, WI) with the CMV promoter
and three myc peptide encoding sequences such that when a cDNA is inserted, three
in-frame myc tags are added to the amino terminus of the expressed protein. pN8epsilon-HA
is identical to pN8epsilon-myc except that it contains three copies of the HA epitope tag
instead of the myc epitope tag.

Forty-eight hours after transfection, cells were removed from the plates by
scraping, washed with PBS, and pelleted. This pellet was resuspended in 100 µl of lysis
buffer (20 mM HEPES, pH 7.5, 0.3 M NaCl, 1.5 mM MgCl2, 0.2 mM EDTA, 0.1% Triton
X-100) and allowed to incubate on ice for 20 minutes. Lysed cells were then spun for 15
minutes in an Eppendorf centrifuge and the resulting supernatant was added to 300 µl of
equilibration buffer (20 mM HEPES, 2.5 mM MgCl2, 1 mM EDTA). 1 µg of mouse
monoclonal antibody against HA (Boehringer Mannheim, Indianapolis, IN) was added
with 20 µl of protein G agarose and incubated overnight with shaking at 4°C. The tube
was then spun and the supernatant was removed leaving the agarose beads. Beads were
washed twice with wash buffer (20 mM HEPES, 0.05 M NaCl, 2.5 mM MgCl2, 1 mM
EDTA, 0.05% Triton X-100), twice with Tris/LiCl buffer (100 mM Tris, 500 mM LiCl), and
then twice again with wash buffer. Wash buffer was removed and 20 µl of protein
loading buffer was added. The tubes were heated at 100°C for 5 minutes and 15 µl was
loaded on a 10% PAGE gel (BioRad, Cambridge, MA) and electrophoresed. Following
electrophoresis, the gel was transferred to nitrocellulose, and Western blotting was
carried out using peroxidase conjugated mouse monoclonal anti-myc antibody (1:2000
dilution) (Boehringer Mannheim, Indianapolis, IN). The blot was visualized using the
ECL system.

EXAMPLE 1: IDENTIFICATION OF EMI1 cDNA

In this example, a yeast two-hybrid assay was performed in which a plasmid
containing a GAL4 DNA-binding domain-fchd540 gene fusion was introduced into the
yeast two-hybrid screening strain HF7c described above. HF7c was then transformed
with the human breast two-hybrid library. Five million transformants were obtained and
plated in selection medium. Colonies that grew in the selection medium and expressed
the beta-galactosidase reporter gene were further characterized and subjected to
retransformation and specificity assays. The retransformation and specificity tests
yielded one clone, EMI1, which was able to bind to selected MADR proteins.

The fchd540 gene coding sequence was amplified by PCR and cloned into
pGBT9 creating a GAL4 DNA-binding domain-fchd540 gene fusion (plasmid
pMB155). HF7c was transformed with this construct resulting in strain TB35. TB35
grew on synthetic complete medium lacking L-tryptophan but not on synthetic complete
medium lacking L-tryptophan and L-histidine, demonstrating that the GAL4 DNA-binding
domain-fchd540 gene fusion does not have intrinsic transcriptional activation
activity.

TB35 was transformed with the human breast two-hybrid library and 5 million
transformants were obtained. The transformants were plated on synthetic complete
medium lacking L-leucine, L-tryptophan, and L-histidine. Yeast colonies that grew on
this medium and that also expressed the beta-galactosidase reporter gene were
identified. The 30 strains with the
strongest beta-galactosidase induction were characterized. Library plasmids were
isolated from the 30 strains, and the 5' ends of all of the cDNA inserts were sequenced.
This sequencing revealed that one cDNA had been identified twice and the other 28
cDNAs had been identified once. It is possible that some of the 28 cDNAs that appear
to be unique are in fact portions of the same gene but, because of different fusion
junctions to the vector, their sequences do not align with each other.

The 29 potentially unique cDNAs were subjected to retransformation and
specificity tests. The yeast two-hybrid system was used to determine whether each
library cDNA-encoded protein could physically interact with a panel of bait proteins
which included the fchd540 gene product (MADR7), the fchd534 gene product
(MADR6), the Drosophila MAD gene product, the DPC4 gene product, the p53 gene
product, and the GAL4 DNA-binding domain. Yeast expression plasmids described
above, which encode the GAL4 DNA-binding domain either alone or fused in-frame to
the fchd540 gene (MADR7), the fchd534 gene (MADR6), the Drosophila MAD gene,
the DPC4 gene, and the p53 gene, were transformed into MATa two-hybrid screening strain
HF7c. Yeast expression plasmids encoding GAL4 activation domain fusions to the 29
cDNAs and SV40 were transformed into MATα two-hybrid screening strain Y187. p53
and SV40 interact with each other and should not interact with the experimental
proteins. The HF7c transformants were propagated as stripes on semisolid synthetic
complete medium lacking L-tryptophan and the Y187 transformants were grown as
stripes on semisolid synthetic complete medium lacking L-leucine. Both sets of stripes
were replica plated in the form of a grid onto a single rich YPAD plate and the haploid
strains of opposite mating types were allowed to mate overnight at 30°C. The yeast
strains on the mating plate were then replica plated to a synthetic complete plate lacking
L-leucine and L-tryptophan to select for diploids and incubated at 30°C overnight.
Diploid strains on the synthetic complete plate lacking L-leucine and L-tryptophan were
replica plated to a synthetic complete plate lacking L-leucine, L-tryptophan, and
L-histidine to assay HIS3 expression, and to a paper filter on a synthetic complete plate
lacking L-leucine and L-tryptophan. The next day, the paper filter was subjected to the
paper filter beta-galactosidase assay to measure expression of the lacZ reporter gene.
HIS3 expression was scored after 3 days of growth at 30°C.

One clone, EMI1, encoded a polypeptide that interacted strongly with the
fchd540 gene product (MADR7), the fchd534 gene product (MADR6), and the
Drosophila MAD gene product but did not interact with other baits in the panel. The
results of the retransformation and specificity test performed on EMI1 are summarized
in Table 4. The strength or absence of physical interaction between each combination of
test proteins is listed. Strong interactions are defined as interactions that cause the
activation of both the HIS3 and lacZ reporter genes.
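The scoring rule just stated, and the way the results reported in Table 4 identify baits that interact specifically with EMI1, can be sketched in Python as follows. This is a minimal illustration: the function names are hypothetical, and the "unclassified" label for a single activated reporter is an assumption, since the text defines only strong interactions.

```python
def score_interaction(his3_growth: bool, lacz_blue: bool) -> str:
    """Apply the stated rule: 'strong' requires activation of both reporters."""
    if his3_growth and lacz_blue:
        return "strong"
    if not his3_growth and not lacz_blue:
        return "none"
    return "unclassified"  # assumption: the text does not define this case


# Results as reported in Table 4: bait -> (score with EMI1, score with SV40).
TABLE_4 = {
    "fchd540 gene product (MADR7)": ("strong", "none"),
    "fchd534 gene product (MADR6)": ("strong", "none"),
    "Drosophila MAD gene product": ("strong", "none"),
    "DPC4 gene product": ("none", "none"),
    "p53 gene product": ("none", "strong"),
    "GAL4 binding domain alone": ("none", "none"),
    "PY motif of fchd540 (MADR7)": ("strong", "none"),
    "P10->A10 mutant PY motif": ("none", "none"),
    "Y12->A12 mutant PY motif": ("none", "none"),
}


def strong_baits(prey_index: int) -> list:
    """Baits scored 'strong' for one prey (0 = EMI1, 1 = SV40)."""
    return [bait for bait, scores in TABLE_4.items() if scores[prey_index] == "strong"]


print("EMI1:", strong_baits(0))
print("SV40:", strong_baits(1))
```

Read this way, the SV40 control scores as strong only with p53, while EMI1 scores as strong with the three MAD-related baits and the wild-type PY motif, but not with DPC4, p53, the GAL4 binding domain alone, or either PY mutant.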

TABLE 4
Summary of EMI1 Retransformation and Specificity Assays

                                 | cDNA-GAL4 Activation Domain Fusion Tested
GAL4 DNA-Binding Domain Fusions  | EMI1   | SV40
---------------------------------|--------|-------
fchd540 gene product (MADR7) | strong | none
fchd534 gene product (MADR6) | strong | none
Drosophila MAD gene product | strong | none
DPC4 gene product | none | none
p53 gene product | none | strong
GAL4 binding domain alone | none | none
PY motif of the fchd540 gene product (MADR7) | strong | none
P10→A10 mutant PY motif of the fchd540 gene product (MADR7) | none | none
Y12→A12 mutant PY motif of the fchd540 gene product (MADR7) | none | none

Specific binding of the EMI1 gene product to three distinct MADR proteins
(MADR7, MADR6, Drosophila MAD gene product) indicated that EMI1 is involved in
a signaling pathway which involves an MADR protein. The complete DNA sequence of
the EMI1 cDNA insert was determined using standard techniques. In brief, using a
standard PCR strategy, the missing 5' portion of the EMI1 clone was amplified out of
the human breast library. The 5' end of EMI1 was spliced onto the 3' end of EMI1 to
create a full length EMI1 clone, the sequence of which was then determined and
analyzed. All sequencing was performed by standard automated fluorescent
dideoxynucleotide sequencing using dye primer chemistry (Applied Biosystems, Inc.,
Foster City, CA) on Applied Biosystems 373 and 377 sequenators. The DNA sequences
were screened to eliminate bacterial, ribosomal, and mitochondrial contaminants.
Sequence artifacts, such as vectors and repetitive elements, were also eliminated.

EXAMPLE 2: EXPRESSION OF RECOMBINANT EMI1 PROTEIN IN
BACTERIAL CELLS

In this example, EMI1 was expressed as a recombinant glutathione-S-transferase
(GST) fusion protein in E. coli and the fusion protein was isolated and characterized.
Specifically, as described above, EMI1 was fused to GST and this fusion protein was
expressed in E. coli strain PEB199. As EMI1 was predicted to be 21 kD and GST was
predicted to be 26 kD, the fusion protein was predicted to be 47 kD in molecular weight.
Expression of the GST-EMI1 fusion protein in PEB199 was induced with IPTG. The
recombinant fusion protein was purified from crude bacterial lysates of the induced
PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide
gel electrophoretic analysis of the proteins purified from the bacterial lysates, the
resultant fusion protein was determined to be 47 kD in size.
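The predicted sizes quoted above can be approximated with a rule-of-thumb calculation. The Python sketch below is illustrative only: it assumes an average residue mass of about 110 Da (a common approximation, not a value taken from this document) and assumes that the GST-fused portion is the EMI1 fragment spanning amino acids 138-335 carried by pMB140B, as described in the materials and methods.

```python
AVERAGE_RESIDUE_MASS_KD = 0.110  # assumption: ~110 Da per amino acid residue
GST_MASS_KD = 26.0               # predicted GST mass stated in the text


def residue_count(first: int, last: int) -> int:
    """Number of residues in an inclusive range."""
    return last - first + 1


def estimated_mass_kd(n_residues: int) -> float:
    """Approximate polypeptide mass from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_KD


n = residue_count(138, 335)
fragment_kd = estimated_mass_kd(n)
print(f"{n} residues, about {fragment_kd:.1f} kD; "
      f"GST fusion about {fragment_kd + GST_MASS_KD:.1f} kD")
```

With these assumptions the fragment comes out at roughly 22 kD and the fusion at roughly 48 kD, in line with the 21 kD and 47 kD predictions given in the text.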

EXAMPLE 3: EXPRESSION OF RECOMBINANT EMI1 PROTEIN IN
COS CELLS

To express the EMI1 gene in COS cells, the pcDNA/Amp vector by Invitrogen
Corporation (San Diego, CA) is used. This vector contains an SV40 origin of
replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter
followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA
fragment encoding the entire EMI1 protein and an HA tag (Wilson et al. (1984) Cell
37:767) fused in-frame to the 3' end of the fragment is cloned into the polylinker region
of the vector, thereby placing the expression of the recombinant protein under the
control of the CMV promoter.

To construct the plasmid, the EMI1 DNA sequence is amplified by PCR using
two primers. The 5' primer contains the restriction site of interest followed by
approximately twenty nucleotides of the EMI1 coding sequence starting from the
initiation codon; the 3' primer contains sequences complementary to the other
restriction site of interest, a translation stop codon, the HA tag, and the last 20
nucleotides of the EMI1 coding sequence. The PCR amplified fragment and the
pcDNA/Amp vector are digested with the appropriate restriction enzymes and the
vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly,
MA). Preferably the two restriction sites chosen are different so that the EMI1 gene is
inserted in the correct orientation. The ligation mixture is transformed into E. coli cells
(strains HB101, DH5α, and SURE, available from Stratagene Cloning Systems, La Jolla,
CA, can be used), the transformed culture is plated on ampicillin media plates, and
resistant colonies are selected. Plasmid DNA is isolated from transformants and
examined by restriction analysis for the presence of the correct fragment.
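The primer layout described above can be sketched in Python. This is an illustration under stated assumptions, not the primers actually used: the coding sequence is a made-up placeholder (the EMI1 sequence is not reproduced here), the EcoRI (GAATTC) and XhoI (CTCGAG) sites are arbitrary stand-ins for "the restriction site of interest", the HA tag DNA is one possible encoding of the YPYDVPDYA epitope, and the extra 5' bases usually added to help enzymes cut near a fragment end are omitted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}
HA_TAG_DNA = "TACCCATACGATGTTCCAGATTACGCT"  # one encoding of YPYDVPDYA
STOP_CODON = "TAA"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(cds: str, site_5prime: str, site_3prime: str,
                   anneal_len: int = 20) -> tuple:
    """Build a forward and a reverse primer following the layout in the text."""
    cds = cds.upper()
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # drop the native stop so the tag is fused in-frame
    if len(cds) < anneal_len:
        raise ValueError("coding sequence shorter than the annealing region")
    # 5' primer: restriction site, then the start of the coding sequence.
    forward = site_5prime + cds[:anneal_len]
    # 3' primer: complement of (end of coding sequence + HA tag + stop + site).
    top_strand_end = cds[-anneal_len:] + HA_TAG_DNA + STOP_CODON + site_3prime
    reverse = reverse_complement(top_strand_end)
    return forward, reverse


# Hypothetical placeholder coding sequence, NOT the EMI1 sequence.
PLACEHOLDER_CDS = "ATGGCCTCTAAAGGTGAAGAACTGTTCACCGGTGTTGTTCCGATCCTGTAA"

forward_primer, reverse_primer = design_primers(PLACEHOLDER_CDS, "GAATTC", "CTCGAG")
print("5' primer:", forward_primer)
print("3' primer:", reverse_primer)
```

Using two different sites, as the text recommends, is what fixes the orientation of the insert: each end of the amplified fragment can ligate to only one end of the digested vector.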

COS cells are subsequently transfected with the EMI1-pcDNA/Amp plasmid
DNA using the calcium phosphate or calcium chloride co-precipitation methods,
DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods
for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T.
Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989. The expression
of the EMI1 protein is detected by radiolabelling (35S-methionine or 35S-cysteine
available from NEN, Boston, MA, can be used) and immunoprecipitation
(Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, 1988) using an HA specific monoclonal
antibody. Briefly, the cells are labelled for 8 hours with 35S-methionine (or
35S-cysteine). The culture media are then collected and the cells are lysed using detergents
(RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5).
Both the cell lysate and the culture media are precipitated with an HA specific
monoclonal antibody. Precipitated proteins are then analyzed by SDS-PAGE.

Alternatively, DNA containing the EMI1 coding sequence is cloned directly into
the polylinker of the pcDNA/Amp vector using the appropriate restriction sites. The
resulting plasmid is transfected into COS cells in the manner described above, and the
expression of the EMI1 protein is detected by radiolabelling and immunoprecipitation
using an EMI1 specific monoclonal antibody.

EXAMPLE 4: CHARACTERIZATION OF EMI1 PROTEIN

In this example, the amino acid sequence of the EMI1 protein was compared to
amino acid sequences of known proteins and various motifs were identified. In addition,
using two hybrid screening assays, the ability of the EMI1 protein to interact with a
panel of MADR proteins was analyzed.

The EMI1 protein, the amino acid sequence of which is shown in Figure 1 (SEQ
ID NO:2), is a novel protein which includes 335 amino acid residues. At its carboxyl
terminus (amino acid residues 300-335), the EMI1 protein includes a 36 amino acid
WW domain. WW domains have been reported to comprise a motif of approximately
38 amino acid residues, one of the prominent features of which is the presence of two
conserved tryptophans (W) (Sudol et al. (1995) FEBS Letters 369:67-71). A WW
domain consensus sequence can be found in the EMI1 protein depicted in SEQ ID NO:2
from amino acid residues 300 to 335 and in SEQ ID NO:4.

The EMI1 WW domain is most similar to the WW domains found in several
ubiquitin protein ligases including mammalian NEDD4 (Staub et al. (1996) EMBO J.
15:2371-2380) and yeast RSP5 (GenBank™ Accession Number U18916:36076-38595).
The highest similarity is 21/36 amino acid identities. However, EMI1 does not contain a
hect domain, the catalytic site of ubiquitin protein ligases (Huibregtse et al. (1995)
PNAS 92:2503-2507), suggesting that EMI1 is not a ubiquitin protein ligase. EMI1 may
regulate protein stability by competing with ubiquitin protein ligases for PY domains, a
WW consensus binding domain described below. This is tested by determining if WW
domains present in ubiquitin protein ligases bind to the same PY motifs as the WW
domain in EMI1.
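The figures quoted above (a 36 residue domain spanning residues 300 to 335, and 21 identities out of 36) can be restated as a percent identity. The Python sketch below is illustrative; the helper names are hypothetical, and the identity-counting function is shown only to make the definition explicit, since the aligned WW domain sequences themselves are not reproduced in this section.

```python
def domain_length(first: int, last: int) -> int:
    """Length of a domain given inclusive first and last residue numbers."""
    return last - first + 1


def count_identities(aligned_a: str, aligned_b: str) -> int:
    """Count positions with the same residue in two equal-length aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")


def percent_identity(identities: int, aligned_length: int) -> float:
    """Identities expressed as a percentage of the aligned length."""
    return 100.0 * identities / aligned_length


ww_length = domain_length(300, 335)
print(f"WW domain length: {ww_length} residues")
print(f"Highest identity: {percent_identity(21, ww_length):.1f}%")
```

On these numbers the closest WW domains share a little under 60% identity with the EMI1 WW domain.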

The consensus sequence bound by the WW domain has been identified and
designated as the PY motif (Chen and Sudol (1995) PNAS 92:7819-7823). The PY
motif includes a proline-rich domain followed by a tyrosine residue. The particular PY
motifs to which the WW domain binds include the following amino acid sequence:
XPPXY, wherein X can be any amino acid residue. Proteins known to include PY motifs
include several members of the MADR family of proteins, at least some members of
which have been characterized as being effectors of the TGF-β response in cells.
Examples of members of the MADR proteins are described herein. The PY motifs of
some MADR proteins are shown below in Table 5:
TABLE 5
PY Motifs of Some MADR Proteins

MAD Protein                     PY Consensus: XPPXY                Source
(amino acid residues)                                              Chen and Sudol (1995) PNAS 92:7819-7823.
------------------------------  ---------------------------------  ----------------------------------------
fchd540 gene product (MADR7)    RLCELESPPPPYSRYP (SEQ ID NO:8)     United States Serial No. 0o/7v9,9 i 0,
(aa 200 to aa 215)                                                 filed February 13, 1997.

fchd534 gene product (MADR6)    PIETQKSPPPPYSRLS (SEQ ID NO:11)    United States Serial No. 08/599,6^4,
(aa 7 to aa 22)                                                    filed February 9, 1996.

hMAD-1                          FQMPADTPPPAYLPPE (SEQ ID NO:12)    Zhang, Y. et al. (1996) Nature
(aa 216 to aa 231)                                                 383:168-172.

hMAD-2                          SNYIPETPPPGYISED (SEQ ID NO:13)    Zhang, Y. et al. (1996) Nature
(aa 214 to aa 229)                                                 383:168-172.

hMAD-3                          QSNIPETPPPGYLSED (SEQ ID NO:14)    Zhang, Y. et al. (1996) Nature
(aa 172 to aa 187)                                                 383:168-172.

Smad5                           FQLPADTPPPAYMPPD (SEQ ID NO:15)    Riggins, G.J. et al. (1996) Nature
(aa 215 to aa 230)                                                 Genetics 13:347-349.

Drosophila MAD                  YDSLAGTPPPAYSPSE (SEQ ID NO:16)    Sekelsky, J.J. et al. (1995) Genetics
(aa 214 to aa 229)                                                 139:1347-1358.

To confirm that the PY domain of the fchd540 gene product (MADR7) was the
region of the fchd540 gene product (MADR7) that interacts with EMI1, two
complementary oligonucleotides encoding the 16 amino acid PY domain of the fchd540
gene product (MADR7) (RLCELESPPPPYSRYP (SEQ ID NO:8)) were synthesized,
annealed to each other, and cloned into the GAL4 DNA-binding domain fusion vector
pGBT9 to create an fchd540 gene product (MADR7) PY bait construct. This construct
was introduced into the two-hybrid screening strain HF7c, resulting in strain TB39.
Strain TB39 was added to the specificity testing panel described above in Example 1.
The results of this specificity testing revealed that EMI1 interacted equally strongly with
the full length 426 amino acid fchd540 gene product (MADR7) protein bait as with the
16 amino acid PY domain of the fchd540 gene product (MADR7) bait. This result
establishes that EMI1 interacts specifically with the PY domain of the fchd540 gene
product (MADR7).
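The design of a pair of complementary oligonucleotides encoding a 16 amino acid peptide can be sketched in Python as follows. This is a hedged illustration, not the design actually used: the codon chosen for each residue is an arbitrary assumption (the text does not state which codons were used), no cloning overhangs or restriction sites are modeled, and all names are hypothetical.

```python
# Hypothetical codon choice for each residue in the peptide; each is a
# valid standard-code codon, but the codons actually used are not stated.
CODON_FOR = {"R": "CGT", "L": "CTG", "C": "TGC", "E": "GAA",
             "S": "TCT", "P": "CCG", "Y": "TAC"}
AA_FOR = {codon: aa for aa, codon in CODON_FOR.items()}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def back_translate(peptide):
    """Encode a peptide as DNA using the fixed codon table above."""
    return "".join(CODON_FOR[aa] for aa in peptide)


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Decode DNA back to a peptide (only codons in AA_FOR are known)."""
    return "".join(AA_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))


PY_DOMAIN = "RLCELESPPPPYSRYP"           # SEQ ID NO:8
top_oligo = back_translate(PY_DOMAIN)    # sense strand, 48 nt
bottom_oligo = reverse_complement(top_oligo)  # anneals to top_oligo

if __name__ == "__main__":
    print(top_oligo)
    print(bottom_oligo)
```

A 16-residue peptide requires 48 nucleotides per strand; translating the top strand back recovers SEQ ID NO:8, which is a useful check before ordering oligonucleotides.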

PY domain baits (16 amino acids in length) which express mutant derivatives of
the fchd540 gene product (MADR7) PY domain were then constructed. Plasmid
pMB147 encodes the 16 amino acid PY domain of the fchd540 gene product (MADR7)
in which the proline at position 10 is mutated to alanine. Plasmid pMB148 encodes the
16 amino acid PY domain of the fchd540 gene product (MADR7) in which the tyrosine
at position 12 is mutated to alanine. Analogous mutations in other PY domains have
been shown to abolish specific binding of PY domains to their cognate WW domains
(Chen and Sudol (1995) PNAS 92:7819-7823). pMB147 and pMB148 were introduced
into HF7c by transformation, creating TB40 and TB41, respectively. Western blotting
confirmed that the transformants expressed both of the mutant PY domain baits. TB40
and TB41 were added to the specificity testing panel described above in Example 1.
Specificity testing with TB40 and TB41 revealed that both the P10->A10 and Y12->A12
mutations abolished binding of the PY domain of the fchd540 gene product
(MADR7) to EMI1 (Table 4). These results demonstrate that the 16 amino acid PY
motif of the fchd540 gene product (MADR7) binds strongly to EMI1 and that two
different amino acid substitutions known to prevent specific PY domain binding to WW
domains block binding of the PY domain of the fchd540 gene product (MADR7) to
EMI1. Taken together, these results show that the PY domain of the fchd540 gene
product (MADR7) binds strongly and specifically to EMI1 and that EMI1 is, therefore, a
regulator of fchd540 gene product (MADR7) activity.
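The two point mutations can be reproduced in silico to see how they affect the XPPXY consensus. The Python sketch below is illustrative only: it applies the P10->A10 and Y12->A12 substitutions to SEQ ID NO:8 and tests each peptide against the pattern `.PP.Y`. The helper names are hypothetical, and a lost pattern match is merely consistent with, not evidence for, the loss of binding reported above.

```python
import re

PY_MOTIF = re.compile(r".PP.Y")   # XPPXY, with "." standing in for X
WILD_TYPE = "RLCELESPPPPYSRYP"    # SEQ ID NO:8


def substitute(peptide, position, new_residue):
    """Replace the residue at a 1-based position with new_residue."""
    i = position - 1
    return peptide[:i] + new_residue + peptide[i + 1:]


def has_py_motif(peptide):
    """True if the peptide contains at least one XPPXY match."""
    return PY_MOTIF.search(peptide) is not None


p10a = substitute(WILD_TYPE, 10, "A")   # corresponds to SEQ ID NO:9
y12a = substitute(WILD_TYPE, 12, "A")   # corresponds to SEQ ID NO:10

if __name__ == "__main__":
    for label, peptide in [("wild type", WILD_TYPE),
                           ("P10->A10", p10a),
                           ("Y12->A12", y12a)]:
        print(label, peptide, has_py_motif(peptide))
```

Both substitutions remove the only XPPXY match in the 16-residue window, in line with the loss of interaction observed for TB40 and TB41.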

To determine if EMI1 associates with the fchd534 gene product (MADR6) and
the fchd540 gene product (MADR7) in endothelial cells, coimmunoprecipitation studies
were performed. Primary BAECs were cotransfected with pN8epsilon-fchd534
(MADR6)-myc and pN8epsilon-EMI1-HA or pN8epsilon-fchd540 (MADR7)-myc and
pN8epsilon-EMI1-HA. Anti-HA antibodies were used in the immunoprecipitation step,
and proteins that were precipitated by the antibodies were electrophoresed, blotted, and
probed with anti-myc antibodies in a Western blotting experiment. The results of the
Western blotting experiment showed that both the fchd534 gene product (MADR6) and
the fchd540 gene product (MADR7) coimmunoprecipitated with EMI1.
Equivalents

Those skilled in the art will recognize, or be able to ascertain using no more than
routine experimentation, many equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be encompassed by the following
claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: MILLENNIUM PHARMACEUTICALS, INC.
    (B) STREET: 640 MEMORIAL DRIVE
    (C) CITY: CAMBRIDGE
    (D) STATE: MASSACHUSETTS
    (E) COUNTRY: US
    (F) POSTAL CODE (ZIP): 02139
    (G) TELEPHONE:
    (H) TELEFAX:

(ii) TITLE OF INVENTION: NOVEL TGF-β PATHWAY GENES

(iii) NUMBER OF SEQUENCES: 16

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: LAHIVE & COCKFIELD, LLP
    (B) STREET: 28 STATE STREET
    (C) CITY: BOSTON
    (D) STATE: MASSACHUSETTS
    (E) COUNTRY: US
    (F) ZIP: 02109-1875

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: PCT/US98/
    (B) FILING DATE: 10 APRIL 1998
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: USSN 08/844,312
    (B) FILING DATE: 10 APRIL 1997

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: HANLEY, ELIZABETH A.
    (B) REGISTRATION NUMBER: 33,505
    (C) REFERENCE/DOCKET NUMBER: MNI-013PC

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (617)227-7400
    (B) TELEFAX: (617)742-4214
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1290 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 77..1081

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TGCGGGCGGT GGAAGGCGGA AGTAGGAGAG GAGTTCGGCG CCGCTTCTGT GGCCACGGCA  60

GCTTCACGGT GATGAT ATG GCA TCT GCC AGC TCT AGC CGG GCA GGA GTG  109
                  Met Ala Ser Ala Ser Ser Ser Arg Ala Gly Val
                  1               5                   10

GCC CTG CCT TTT GAG AAG TCT CAG CTC ACT TTG AAA GTG GTG TCC GCA  157
Ala Leu Pro Phe Glu Lys Ser Gln Leu Thr Leu Lys Val Val Ser Ala
            15                  20                  25

AAG CCC AAG GTG CAT AAT CGT CAA CCG CGA ATT AAC TCC TAC GTG GAG  205
Lys Pro Lys Val His Asn Arg Gln Pro Arg Ile Asn Ser Tyr Val Glu
        30                  35                  40

GTG GCG GTG GAT GGA CTC CCC AGT GAG ACC AAG AAG ACT GGG AAG CGC  253
Val Ala Val Asp Gly Leu Pro Ser Glu Thr Lys Lys Thr Gly Lys Arg
    45                  50                  55

ATT GGG AGC TCT GAG CTT CTC TGG AAT GAG ATC ATC ATT TTG AAT GTT  301
Ile Gly Ser Ser Glu Leu Leu Trp Asn Glu Ile Ile Ile Leu Asn Val
60                  65                  70                  75

ACG GCA CAG AGT CAT TTA GAT TTA AAG GTC TGG AGC TGC CAT ACC TTG  349
Thr Ala Gln Ser His Leu Asp Leu Lys Val Trp Ser Cys His Thr Leu
                80                  85                  90

AGA AAT GAA CTG CTA GGC ACC GCA TCT GTC AAC CTC TCC AAC GTC TTG  397
Arg Asn Glu Leu Leu Gly Thr Ala Ser Val Asn Leu Ser Asn Val Leu
            95                  100                 105

AAG AAC AAT GGG GGC AAA ATG GAG AAC ATG CAG CTG ACC CTG AAC CTG  445
Lys Asn Asn Gly Gly Lys Met Glu Asn Met Gln Leu Thr Leu Asn Leu
        110                 115                 120

CAG ACG GAG AAC AAA GGC AGC GTT GTC TCA GGC GGA GAG CTG ACA ATT  493
Gln Thr Glu Asn Lys Gly Ser Val Val Ser Gly Gly Glu Leu Thr Ile
    125                 130                 135

TTC CTG GAC GGG CCA ACT GTT GAT CTG GGA AAT GTG CCT AAT GGC AGT  541
Phe Leu Asp Gly Pro Thr Val Asp Leu Gly Asn Val Pro Asn Gly Ser
140                 145                 150                 155

GCC CTG ACA GAT GGA TCA CAG CTG CCT TCG AGA GAC TCC AGT GGA ACA  589
Ala Leu Thr Asp Gly Ser Gln Leu Pro Ser Arg Asp Ser Ser Gly Thr
                160                 165                 170

GCA GTA GCT CCA GAG AAC CGG CAC CAG CCC CCC AGC ACA AAC TGC TTT  637
Ala Val Ala Pro Glu Asn Arg His Gln Pro Pro Ser Thr Asn Cys Phe
            175                 180                 185

GGT GGA AGA TCC CGG ACG CAC AGA CAT TCG GGT GCT TCA GCC AGA ACA  685
Gly Gly Arg Ser Arg Thr His Arg His Ser Gly Ala Ser Ala Arg Thr
        190                 195                 200

ACC CCA GCA ACC GGC GAG CAA AGC CCC GGT GCT CGG AGC CGG CAC CGC  733
Thr Pro Ala Thr Gly Glu Gln Ser Pro Gly Ala Arg Ser Arg His Arg
    205                 210                 215

CAG CCC GTC AAG AAC TCA GGC CAC AGT GGC TTG GCC AAT GGC ACA GTG  781
Gln Pro Val Lys Asn Ser Gly His Ser Gly Leu Ala Asn Gly Thr Val
220                 225                 230                 235

AAT GAT GAA CCC ACA ACA GCC ACT GAT CCC GAA GAA CCT TCC GTT GTT  829
Asn Asp Glu Pro Thr Thr Ala Thr Asp Pro Glu Glu Pro Ser Val Val
                240                 245                 250

GGT GTG ACG TCC CCA CCT GCT GCA CCC TTG AGT GTG ACC CCG AAT CCC  877
Gly Val Thr Ser Pro Pro Ala Ala Pro Leu Ser Val Thr Pro Asn Pro
            255                 260                 265

AAC ACG ACT TCT CTC CCT GCC CCA GCC ACA CCG GCT GAA GGA GAG GAA  925
Asn Thr Thr Ser Leu Pro Ala Pro Ala Thr Pro Ala Glu Gly Glu Glu
        270                 275                 280

CCC AGC ACT TCG GGT ACA CAG CAG CTC CCA GCG GCT GCC CAG GCC CCC  973
Pro Ser Thr Ser Gly Thr Gln Gln Leu Pro Ala Ala Ala Gln Ala Pro
    285                 290                 295

GAC GCT CTG CCT GCT GGA TGG GAA CAG CGA GAG CTG CCC AAC GGA CGT  1021
Asp Ala Leu Pro Ala Gly Trp Glu Gln Arg Glu Leu Pro Asn Gly Arg
300                 305                 310                 315

GTC TAT TAT GTT GAC CAC AAT ACC AAG ACC ACC ACC TGG GAG CGG CCC  1069
Val Tyr Tyr Val Asp His Asn Thr Lys Thr Thr Thr Trp Glu Arg Pro
                320                 325                 330

CTT CCT CCA GGG TAGGTCATCA ACTGAGAAGA CCTGAGACTC TGGAACTGAC  1121
Leu Pro Pro Gly
            335

ACCATGAGTC ACCCAATGGC TTCTTGAAAC GGTCCCTTTC TGCGGAGGTA GCATAGCACA  1181

GTGACGTTTA TTCCGGGTCA CTTGATTGCT TTTCCTATCC ACTTACCTTA ATATTGCTCC  1241

CATGTCTTAG GACATATTAG AATTATTAGA AGATCTCTGG GAAACAAAA  1290

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 335 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Ala Ser Ala Ser Ser Ser Arg Ala Gly Val Ala Leu Pro Phe Glu
1               5                   10                  15

Lys Ser Gln Leu Thr Leu Lys Val Val Ser Ala Lys Pro Lys Val His
            20                  25                  30

Asn Arg Gln Pro Arg Ile Asn Ser Tyr Val Glu Val Ala Val Asp Gly
        35                  40                  45

Leu Pro Ser Glu Thr Lys Lys Thr Gly Lys Arg Ile Gly Ser Ser Glu
    50                  55                  60

Leu Leu Trp Asn Glu Ile Ile Ile Leu Asn Val Thr Ala Gln Ser His
65                  70                  75                  80

Leu Asp Leu Lys Val Trp Ser Cys His Thr Leu Arg Asn Glu Leu Leu
                85                  90                  95

Gly Thr Ala Ser Val Asn Leu Ser Asn Val Leu Lys Asn Asn Gly Gly
            100                 105                 110

Lys Met Glu Asn Met Gln Leu Thr Leu Asn Leu Gln Thr Glu Asn Lys
        115                 120                 125

Gly Ser Val Val Ser Gly Gly Glu Leu Thr Ile Phe Leu Asp Gly Pro
    130                 135                 140

Thr Val Asp Leu Gly Asn Val Pro Asn Gly Ser Ala Leu Thr Asp Gly
145                 150                 155                 160

Ser Gln Leu Pro Ser Arg Asp Ser Ser Gly Thr Ala Val Ala Pro Glu
                165                 170                 175

Asn Arg His Gln Pro Pro Ser Thr Asn Cys Phe Gly Gly Arg Ser Arg
            180                 185                 190

Thr His Arg His Ser Gly Ala Ser Ala Arg Thr Thr Pro Ala Thr Gly
        195                 200                 205

Glu Gln Ser Pro Gly Ala Arg Ser Arg His Arg Gln Pro Val Lys Asn
    210                 215                 220

Ser Gly His Ser Gly Leu Ala Asn Gly Thr Val Asn Asp Glu Pro Thr
225                 230                 235                 240

Thr Ala Thr Asp Pro Glu Glu Pro Ser Val Val Gly Val Thr Ser Pro
                245                 250                 255

Pro Ala Ala Pro Leu Ser Val Thr Pro Asn Pro Asn Thr Thr Ser Leu
            260                 265                 270

Pro Ala Pro Ala Thr Pro Ala Glu Gly Glu Glu Pro Ser Thr Ser Gly
        275                 280                 285

Thr Gln Gln Leu Pro Ala Ala Ala Gln Ala Pro Asp Ala Leu Pro Ala
    290                 295                 300

Gly Trp Glu Gln Arg Glu Leu Pro Asn Gly Arg Val Tyr Tyr Val Asp
305                 310                 315                 320

His Asn Thr Lys Thr Thr Thr Trp Glu Arg Pro Leu Pro Pro Gly
                325                 330                 335

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1005 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 1..1005

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATG GCA TCT GCC AGC TCT AGC CGG GCA GGA GTG GCC CTG CCT TTT GAG  48
Met Ala Ser Ala Ser Ser Ser Arg Ala Gly Val Ala Leu Pro Phe Glu
1               5                   10                  15

AAG TCT CAG CTC ACT TTG AAA GTG GTG TCC GCA AAG CCC AAG GTG CAT  96
Lys Ser Gln Leu Thr Leu Lys Val Val Ser Ala Lys Pro Lys Val His
            20                  25                  30

AAT CGT CAA CCG CGA ATT AAC TCC TAC GTG GAG GTG GCG GTG GAT GGA  144
Asn Arg Gln Pro Arg Ile Asn Ser Tyr Val Glu Val Ala Val Asp Gly
        35                  40                  45

CTC CCC AGT GAG ACC AAG AAG ACT GGG AAG CGC ATT GGG AGC TCT GAG  192
Leu Pro Ser Glu Thr Lys Lys Thr Gly Lys Arg Ile Gly Ser Ser Glu
    50                  55                  60

CTT CTC TGG AAT GAG ATC ATC ATT TTG AAT GTT ACG GCA CAG AGT CAT  240
Leu Leu Trp Asn Glu Ile Ile Ile Leu Asn Val Thr Ala Gln Ser His
65                  70                  75                  80

TTA GAT TTA AAG GTC TGG AGC TGC CAT ACC TTG AGA AAT GAA CTG CTA  288
Leu Asp Leu Lys Val Trp Ser Cys His Thr Leu Arg Asn Glu Leu Leu
                85                  90                  95

GGC ACC GCA TCT GTC AAC CTC TCC AAC GTC TTG AAG AAC AAT GGG GGC  336
Gly Thr Ala Ser Val Asn Leu Ser Asn Val Leu Lys Asn Asn Gly Gly
            100                 105                 110

AAA ATG GAG AAC ATG CAG CTG ACC CTG AAC CTG CAG ACG GAG AAC AAA  384
Lys Met Glu Asn Met Gln Leu Thr Leu Asn Leu Gln Thr Glu Asn Lys
        115                 120                 125

GGC AGC GTT GTC TCA GGC GGA GAG CTG ACA ATT TTC CTG GAC GGG CCA  432
Gly Ser Val Val Ser Gly Gly Glu Leu Thr Ile Phe Leu Asp Gly Pro
    130                 135                 140

ACT GTT GAT CTG GGA AAT GTG CCT AAT GGC AGT GCC CTG ACA GAT GGA  480
Thr Val Asp Leu Gly Asn Val Pro Asn Gly Ser Ala Leu Thr Asp Gly
145                 150                 155                 160

TCA CAG CTG CCT TCG AGA GAC TCC AGT GGA ACA GCA GTA GCT CCA GAG  528
Ser Gln Leu Pro Ser Arg Asp Ser Ser Gly Thr Ala Val Ala Pro Glu
                165                 170                 175

AAC CGG CAC CAG CCC CCC AGC ACA AAC TGC TTT GGT GGA AGA TCC CGG  576
Asn Arg His Gln Pro Pro Ser Thr Asn Cys Phe Gly Gly Arg Ser Arg
            180                 185                 190

ACG CAC AGA CAT TCG GGT GCT TCA GCC AGA ACA ACC CCA GCA ACC GGC  624
Thr His Arg His Ser Gly Ala Ser Ala Arg Thr Thr Pro Ala Thr Gly
        195                 200                 205

GAG CAA AGC CCC GGT GCT CGG AGC CGG CAC CGC CAG CCC GTC AAG AAC  672
Glu Gln Ser Pro Gly Ala Arg Ser Arg His Arg Gln Pro Val Lys Asn
    210                 215                 220

TCA GGC CAC AGT GGC TTG GCC AAT GGC ACA GTG AAT GAT GAA CCC ACA  720
Ser Gly His Ser Gly Leu Ala Asn Gly Thr Val Asn Asp Glu Pro Thr
225                 230                 235                 240

ACA GCC ACT GAT CCC GAA GAA CCT TCC GTT GTT GGT GTG ACG TCC CCA  768
Thr Ala Thr Asp Pro Glu Glu Pro Ser Val Val Gly Val Thr Ser Pro
                245                 250                 255

CCT GCT GCA CCC TTG AGT GTG ACC CCG AAT CCC AAC ACG ACT TCT CTC  816
Pro Ala Ala Pro Leu Ser Val Thr Pro Asn Pro Asn Thr Thr Ser Leu
            260                 265                 270

CCT GCC CCA GCC ACA CCG GCT GAA GGA GAG GAA CCC AGC ACT TCG GGT  864
Pro Ala Pro Ala Thr Pro Ala Glu Gly Glu Glu Pro Ser Thr Ser Gly
        275                 280                 285

ACA CAG CAG CTC CCA GCG GCT GCC CAG GCC CCC GAC GCT CTG CCT GCT  912
Thr Gln Gln Leu Pro Ala Ala Ala Gln Ala Pro Asp Ala Leu Pro Ala
    290                 295                 300

GGA TGG GAA CAG CGA GAG CTG CCC AAC GGA CGT GTC TAT TAT GTT GAC  960
Gly Trp Glu Gln Arg Glu Leu Pro Asn Gly Arg Val Tyr Tyr Val Asp
305                 310                 315                 320

CAC AAT ACC AAG ACC ACC ACC TGG GAG CGG CCC CTT CCT CCA GGG  1005
His Asn Thr Lys Thr Thr Thr Trp Glu Arg Pro Leu Pro Pro Gly
                325                 330                 335

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Asp Ala Leu Pro Ala Gly Trp Glu Gln Arg Glu Leu Pro Asn Gly Arg
1               5                   10                  15

Val Tyr Tyr Val Asp His Asn Thr Lys Thr Thr Thr Trp Glu Arg Pro
            20                  25                  30

Leu Pro Pro Gly
        35

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 22 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CGTCGAAGTG CCACTACTAT AC  22

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 17 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

AGTTCCAGAG TCTCAGG  17

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 20 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

ACGTCACTGT GCTATGCTAC  20

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Arg Leu Cys Glu Leu Glu Ser Pro Pro Pro Pro Tyr Ser Arg Tyr Pro
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Arg Leu Cys Glu Leu Glu Ser Pro Pro Ala Pro Tyr Ser Arg Tyr Pro
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Arg Leu Cys Glu Leu Glu Ser Pro Pro Pro Pro Ala Ser Arg Tyr Pro
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Pro Ile Glu Thr Gln Lys Ser Pro Pro Pro Pro Tyr Ser Arg Leu Ser
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Phe Gln Met Pro Ala Asp Thr Pro Pro Pro Ala Tyr Leu Pro Pro Glu
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Ser Asn Tyr Ile Pro Glu Thr Pro Pro Pro Gly Tyr Ile Ser Glu Asp
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Gln Ser Asn Ile Pro Glu Thr Pro Pro Pro Gly Tyr Leu Ser Glu Asp
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Phe Gln Leu Pro Ala Asp Thr Pro Pro Pro Ala Tyr Met Pro Pro Asp
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Tyr Asp Ser Leu Ala Gly Thr Pro Pro Pro Ala Tyr Ser Pro Ser Glu
1               5                   10                  15
What is claimed is:

1. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding EMI1 or a biologically active portion thereof.

2. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a protein or a portion thereof, wherein the protein or portion thereof comprises
an amino acid sequence which is sufficiently homologous to an amino acid sequence of
SEQ ID NO:2 such that the protein or portion thereof maintains the ability to modulate a
TGF-β response in a TGF-β responsive cell.

3. The isolated nucleic acid molecule of claim 2, wherein the protein
comprises an amino acid sequence at least about 70% homologous to the entire amino
acid sequence of SEQ ID NO:2.

4. The isolated nucleic acid molecule of claim 2, wherein the portion of the
protein comprises a WW domain which is at least about 75% homologous to the amino
acid sequence of SEQ ID NO:4.

5. An isolated nucleic acid molecule comprising a nucleotide sequence
encoding a full length human protein which is substantially homologous to the amino
acid sequence shown in SEQ ID NO:2.

6. An isolated nucleic acid molecule at least 15 nucleotides in length which
hybridizes to a nucleic acid molecule comprising the nucleotide sequence of SEQ ID
NO:1 or to a nucleic acid molecule comprising the nucleotide sequence of the DNA
insert of the plasmid deposited with ATCC as Accession Number 98375.

7. The isolated nucleic acid molecule of claim 6 which comprises a
naturally-occurring nucleotide sequence.

8. The isolated nucleic acid molecule of claim 7 which encodes human
EMI1.

9. The isolated nucleic acid molecule of claim 6 which encodes a
biologically active portion of human EMI1.
10. An isolated nucleic acid molecule comprising the nucleotide sequence of
SEQ ID NO:1 or the nucleotide sequence of the DNA insert of the plasmid deposited
with ATCC as Accession Number 98375.

11. The isolated nucleic acid molecule of claim 10, comprising the coding
region of the nucleotide sequence of SEQ ID NO:1.

12. An isolated nucleic acid molecule encoding the amino acid sequence of
SEQ ID NO:2 or an amino acid sequence encoded by the nucleotide sequence of the
DNA insert of the plasmid deposited with ATCC as Accession Number 98375.

13. An isolated nucleic acid molecule encoding an EMI1 fusion protein.

14. An isolated nucleic acid molecule which is antisense to the nucleic acid
molecule of claim 1.

15. An isolated nucleic acid molecule which is antisense to a coding region
of the coding strand of the nucleotide sequence of SEQ ID NO:1.

16. An isolated nucleic acid molecule which is antisense to a noncoding
region of the coding strand of the nucleotide sequence of SEQ ID NO:1.

17. A vector comprising a nucleotide sequence encoding EMI1.

18. The vector of claim 17, which is a recombinant expression vector.

19. The vector of claim 18, which encodes a protein comprising the amino
acid sequence of SEQ ID NO:2.

20. The vector of claim 18, which comprises the coding region of the
nucleotide sequence of SEQ ID NO:1.

21. A host cell containing the vector of claim 17.

22. A host cell containing the recombinant expression vector of claim 18.

23. A method for producing EMI1 comprising culturing the host cell of claim
22 in a suitable medium until EMI1 is produced.



24. The method of claim 23, further comprising isolating EMI1 from the
medium or the host cell.

25. An isolated EMI1 protein or a portion thereof which can modulate a
TGF-β response in a TGF-β responsive cell.

26. An isolated protein or a portion thereof which comprises an amino acid
sequence which is sufficiently homologous to an amino acid sequence of SEQ ID NO:2
such that the protein or portion thereof maintains the ability to modulate a TGF-β
response in a TGF-β responsive cell.

27. The isolated protein or portion thereof of claim 26, wherein the portion of
the protein comprises a WW domain which is at least about 75% homologous to the
amino acid sequence of SEQ ID NO:4.

28. An isolated full length human protein which is substantially homologous
to the amino acid sequence of SEQ ID NO:2.

29. An isolated protein comprising the amino acid sequence of SEQ ID NO:2
or the amino acid sequence encoded by the nucleotide sequence of the DNA insert of the
plasmid deposited with ATCC as Accession Number 98375.

30. A pharmaceutical composition comprising the protein of claim 29 and a
pharmaceutically acceptable carrier.

31. A fusion protein comprising an EMI1 polypeptide operatively linked to a
non-EMI1 polypeptide.

32. An antigenic peptide of EMI1 comprising at least 8 amino acid residues
of the amino acid sequence shown in SEQ ID NO:2, the peptide comprising an epitope
of EMI1 such that an antibody raised against the peptide forms a specific immune
complex with EMI1.

33. An antibody that specifically binds EMI1.
34. The antibody of claim 33, which is monoclonal.

35. The antibody of claim 34, which is coupled to a detectable substance.

36. A pharmaceutical composition comprising the antibody of claim 34 and a
pharmaceutically acceptable carrier.

37. A nonhuman transgenic animal which contains cells carrying a transgene
encoding EMI1.

38. A nonhuman homologous recombinant animal which contains cells
having an altered EMI1 gene.

39. A method for modulating a cell associated activity comprising contacting
the cell with an agent which modulates EMI1 protein activity or EMI1 nucleic acid
expression such that a cell associated activity is altered relative to a cell associated
activity of the cell in the absence of the agent.

40. The method of claim 39, wherein the cell is capable of responding to
TGF-β through a signaling pathway involving an EMI1 protein.

41. The method of claim 40, wherein the cell is an endothelial cell.

42. The method of claim 40, wherein the cell is an epithelial cell.

43. The method of claim 39, wherein the agent stimulates EMI1 activity or
expression.

44. The method of claim 43, wherein the agent is an active EMI1 protein.

45. The method of claim 43, wherein the agent is a nucleic acid encoding
EMI1 that has been introduced into the cell.

46. The method of claim 39, wherein the agent inhibits the EMI1 activity or
expression.

47. The method of claim 46, wherein the agent is an antisense EMI1 nucleic
acid molecule.

48. The method of claim 46, wherein the agent is an antibody that
specifically binds to EMI1.

49. The method of claim 39, wherein the cell is present within a subject and
the agent is administered to the subject.

50. A method for treating a subject having a disorder characterized by
aberrant EMI1 protein activity or nucleic acid expression comprising administering to
the subject an EMI1 modulator such that treatment of the subject occurs.

51. The method of claim 50, wherein the EMI1 modulator is a small
molecule.

52. The method of claim 50, wherein the disorder is a cardiovascular
disorder.

53. The method of claim 52, wherein the cardiovascular disorder is
atherosclerosis.

54. The method of claim 50, wherein the disorder is a proliferative disorder.

55. A method for treating a subject having a cardiovascular disorder
comprising administering to the subject an EMI1 modulator such that treatment occurs.

56. The method of claim 55, wherein the cardiovascular disorder is
atherosclerosis.

57. The method of claim 55, wherein the EMI1 modulator is a small
molecule.

58. A method for treating a subject having a proliferative disorder comprising
administering to the subject an EMI1 modulator such that treatment occurs.

59. The method of claim 58, wherein the EMI1 modulator is a small
molecule.
60. The method of claim 58, wherein the proliferative disorder is a disorder
characterized by uncontrolled proliferation of epithelial cells.

61. The method of claim 60, wherein the epithelial cells are gut-derived
epithelial cells.

62. A method for treating a subject having a cardiovascular disorder
comprising administering to the subject an EMI1 protein or portion thereof such that
treatment occurs.

63. A method for treating a subject having a cardiovascular disorder
comprising administering to the subject a nucleic acid encoding an EMI1 protein or
portion thereof such that treatment occurs.

64. A method for treating a subject having a proliferative disorder comprising
administering to the subject an EMI1 protein or portion thereof such that treatment
occurs.

65. A method for treating a subject having a proliferative disorder comprising
administering to the subject a nucleic acid encoding an EMI1 protein or portion thereof
such that treatment occurs.

66. A method for detecting the presence of EMI1 in a biological sample
comprising contacting a biological sample with an agent capable of detecting EMI1
protein or mRNA such that the presence of EMI1 is detected in the biological sample.

67. The method of claim 66, wherein the agent is a labeled or labelable
nucleic acid probe capable of hybridizing to EMI1 mRNA.

68. The method of claim 66, wherein the agent is a labeled or labelable
antibody capable of specifically binding to EMI1 protein.

69. A kit for detecting the presence of EMI1 in a biological sample
comprising a labeled or labelable agent capable of detecting EMI1 protein or mRNA in a
biological sample; means for determining the amount of EMI1 in the sample; and means
for comparing the amount of EMI1 in the sample with a standard.
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70. The kit of claim 69, wherein the agent is a nucleic acid probe capable of 
hybridizing to EMU mRNA. 

5 71 . The kit of claim 69, wherein the agent is an antibody capable of 

specifically binding to EMIl protein. 

72. A method for determining if a subject is at risk for a disorder 
characterized by aberrant or abnormal EMU nucleic acid expression and/or EMI 1 
1 0 protein activity comprising detecting, in a sample of cells from the subject, the presence 
or absence of a genetic lesion, wherein the genetic lesion is characterized by an 
alteration affecting the integrity of a gene encoding an EMIl protein or misexpression of 
the EMI 1 gene. 

* 5 73. A method for identifying a compound capable of treating a disorder 

characterized by aberrant EMU nucleic acid expression or EMIl protein activity 
comprising assaying the ability of the compound or agent to modulate the expression of 
EMI 1 nucleic acid or the activity of the EMIl protein thereby identifying a compound 
capable of treating a disorder characterized by aberrant EMIl nucleic acid expression or 

20 EMIl protein activity, 

74. The method of claim 73, wherein the disorder is a cardiovascular 
disorder. 

25 75. The method of claim 73, wherein the disorder is a proliferative disorder. 

76. A method for identifying a compound which binds to EMI1 protein
comprising contacting the EMI1 protein with the compound under conditions which
allow binding of the compound to the EMI1 protein to form a complex; and detecting
the formation of a complex of the EMI1 protein and the compound in which the ability
of the compound to bind to the EMI1 protein is indicated by the presence of the
compound in the complex.

77. A method for identifying a compound which inhibits the interaction of
the EMI1 protein with a target molecule comprising contacting, in the presence of the
compound, the EMI1 protein and the target molecule under conditions which allow
binding of the target molecule to the EMI1 protein to form a complex; and detecting the
formation of a complex of the EMI1 protein and the target molecule in which the ability
of the compound to inhibit interaction between the EMI1 protein and the target molecule
is indicated by a decrease in complex formation as compared to the amount of complex
formed in the absence of the compound.

78. The method of claim 77, wherein the target molecule is MADR6. 

79. The method of claim 77, wherein the target molecule is MADR7. 

80. The method of claim 77, wherein the target molecule is a complex of
MADR6 and MADR7.

TGCGGGCGGT GGAAGGCGGA AGTAGGAGAG GAGTTCGGCG CCGCTTCTGT GGCCACGGCA  60

GCTTCACGGT GATGAT ATG GCA TCT GCC AGC TCT AGC CGG GCA GGA GTG GCC CTG CCT TTT  121
                  M   A   S   A   S   S   S   R   A   G   V   A   L   P   F    15

GAG AAG TCT CAG CTC ACT TTG AAA GTG GTG TCC GCA AAG CCC AAG GTG CAT AAT CGT CAA  181
E   K   S   Q   L   T   L   K   V   V   S   A   K   P   K   V   H   N   R   Q    35

CCG CGA ATT AAC TCC TAC GTG GAG GTG GCG GTG GAT GGA CTC CCC AGT GAG ACC AAG AAG  241
P   R   I   N   S   Y   V   E   V   A   V   D   G   L   P   S   E   T   K   K    55

ACT GGG AAG CGC ATT GGG AGC TCT GAG CTT CTC TGG AAT GAG ATC ATC ATT TTG AAT GTT ACG  304
T   G   K   R   I   G   S   S   E   L   L   W   N   E   I   I   I   L   N   V   T    76

GCA CAG AGT CAT TTA GAT TTA AAG GTC TGG AGC TGC CAT ACC TTG AGA AAT GAA CTG CTA GGC  367
A   Q   S   H   L   D   L   K   V   W   S   C   H   T   L   R   N   E   L   L   G    97

ACC GCA TCT GTC AAC CTC TCC AAC GTC TTG AAG AAC AAT GGG GGC AAA ATG GAG AAC ATG  427
T   A   S   V   N   L   S   N   V   L   K   N   N   G   G   K   M   E   N   M    117

CAG CTG ACC CTG AAC CTG CAG ACG GAG AAC AAA GGC AGC GTT GTC TCA GGC GGA GAG CTG  487
Q   L   T   L   N   L   Q   T   E   N   K   G   S   V   V   S   G   G   E   L    137

ACA ATT TTC CTG GAC GGG CCA ACT GTT GAT CTG GGA AAT GTG CCT AAT GGC AGT GCC CTG ACA  550
T   I   F   L   D   G   P   T   V   D   L   G   N   V   P   N   G   S   A   L   T    158

GAT GGA TCA CAG CTG CCT TCG AGA GAC TCC AGT GGA ACA GCA GTA GCT CCA GAG AAC CGG  610
D   G   S   Q   L   P   S   R   D   S   S   G   T   A   V   A   P   E   N   R    178

CAC CAG CCC CCC AGC ACA AAC TGC TTT GGT GGA AGA TCC CGG ACG CAC AGA CAT TCG GGT GCT  673
H   Q   P   P   S   T   N   C   F   G   G   R   S   R   T   H   R   H   S   G   A    199

TCA GCC AGA ACA ACC CCA GCA ACC GGC GAG CAA AGC CCC GGT GCT CGG AGC CGG CAC CGC  733
S   A   R   T   T   P   A   T   G   E   Q   S   P   G   A   R   S   R   H   R    219

CAG CCC GTC AAG AAC TCA GGC CAC AGT GGC TTG GCC AAT GGC ACA GTG AAT GAT GAA CCC  793
Q   P   V   K   N   S   G   H   S   G   L   A   N   G   T   V   N   D   E   P    239

ACA ACA GCC ACT GAT CCC GAA GAA CCT TCC GTT GTT GGT GTG ACG TCC CCA CCT GCT GCA CCC  856
T   T   A   T   D   P   E   E   P   S   V   V   G   V   T   S   P   P   A   A   P    260

TTG AGT GTG ACC CCG AAT CCC AAC ACG ACT TCT CTC CCT GCC CCA GCC ACA CCG GCT GAA  916
L   S   V   T   P   N   P   N   T   T   S   L   P   A   P   A   T   P   A   E    280

GGA GAG GAA CCC AGC ACT TCG GGT ACA CAG CAG CTC CCA GCG GCT GCC CAG GCC CCC GAC  976
G   E   E   P   S   T   S   G   T   Q   Q   L   P   A   A   A   Q   A   P   D    300

GCT CTG CCT GCT GGA TGG GAA CAG CGA GAG CTG CCC AAC GGA CGT GTC TAT TAT GTT GAC  1036
A   L   P   A   G   W   E   Q   R   E   L   P   N   G   R   V   Y   Y   V   D    320

FIGURE I 